Skip to Main Content

Breadcrumb

Dev Live Dev Intro

We are celebrating Developers at AskTOM. We welcome Developers of all levels of experience to join us at our FREE Developer Live events coming in August and September. Just click on the left to register today! If you are brand new to Database Technology, then we also have got you covered. Just click on the right for your comprehensive FREE training program to kick start your Oracle Database Development journey!

Question and Answer

Tom Kyte

Thanks for the question, John.

Asked: October 21, 2002 - 5:01 pm UTC

Answered by: Tom Kyte - Last updated: October 17, 2008 - 9:07 pm UTC

Category: Developer - Version: 9.2

Viewed 50K+ times! This question is

You Asked

tom,
i need to randomely select 4 employees from one of our core tables

SQL> desc people
Name Null? Type
----------------------------------------- -------- ----------------------------
TYPE_ID NOT NULL VARCHAR2(1)
EMP_ID NOT NULL VARCHAR2(10)
ID NUMBER(38)
IMPS_OFFICE_ID VARCHAR2(6)
REGION VARCHAR2(10)

from the above table i need to randomely select 4 employees(i.e values of the column "id")
based on some where conditions

where conditions are like we have to choose four employees who belong to a department and who are active
and who have agreed to be picked up randomely, for this the above table is joined with some other tables.



1.i see like based on the conditions i get the row count with all the join condtions

2.then use the same above conditions and use row_number() analytical function,dbms_random and count found
in step 1 to get the ids.

is the above correct, or any suggestions and better ways.

would really appreciate your reply

and we said...

scott@ORA920.US.ORACLE.COM> select *
2 from ( select empno, ename
3 from emp
4 where ename like '%'
5 order by dbms_random.value )
6 where rownum <= 4
7 /

EMPNO ENAME
---------- ----------
7934 miller
7902 ford
7900 james
7839 king

scott@ORA920.US.ORACLE.COM> /

EMPNO ENAME
---------- ----------
7521 ward
7844 turner
7788 scott
7698 blake

scott@ORA920.US.ORACLE.COM> /

EMPNO ENAME
---------- ----------
7900 james
7844 turner
7566 jones
7876 adams

scott@ORA920.US.ORACLE.COM>

and you rated our response

  (48 ratings)

Is this answer out of date? If it is, please let us know via a Review

Reviews

thanks, but it is slow

October 21, 2002 - 8:26 pm UTC

Reviewer: John

thanks for that.

in a department we can have at an average 1000 to 5000 employees.

but i run the query in my db, it takes around 8 seconds
this is the autotrace output

1)

SQL> SELECT * FROM ( SELECT * FROM people ORDER BY dbms_random.VALUE ) WHERE ROWNUM <= 4

Execution Plan
----------------------------------------------------------
   0|    |SELECT STATEMENT Optimizer=CHOOSE (Cost=1780 Card=4 Bytes=28|
   1|   0|  COUNT (STOPKEY)                                           |
   2|   1|    VIEW (Cost=1780 Card=88663 Bytes=28904138)              |
   3|   2|      SORT (ORDER BY STOPKEY) (Cost=1780 Card=88663 Bytes=10|
   4|   3|        TABLE ACCESS (FULL) OF 'SID_EM_REF' (Cost=157 Card=8|




Statistics
----------------------------------------------------------
         26  recursive calls
          0  db block gets
       1626  consistent gets
       1351  physical reads
          0  redo size
       1978  bytes sent via SQL*Net to client
        311  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          4  rows processed


but when i do,

2)

SQL> SELECT * FROM (SELECT a.*, ROWNUM rn FROM SID_EM_REF a WHERE ROWNUM <= 562) WHERE rn = 562

Execution Plan
----------------------------------------------------------
   0|    |SELECT STATEMENT Optimizer=CHOOSE (Cost=157 Card=562 Bytes=1
   1|   0|  VIEW (Cost=157 Card=562 Bytes=190518)                     
   2|   1|    COUNT (STOPKEY)                                         
   3|   2|      TABLE ACCESS (FULL) OF 'SID_EM_REF' (Cost=157 Card=886




Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
         16  consistent gets
          0  physical reads
          0  redo size
       1550  bytes sent via SQL*Net to client
        311  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processed

so, i think i write a proc and store dbms_random.value in 4 variables and union all the queries simular to 2 replacing 562 with the 4 variables.

any feedback.

because we are going to put this in home page of site.i think 8 seconds is slow.
 

Tom Kyte

Followup  

October 22, 2002 - 7:04 am UTC

generating 5000 random numbers is somewhat CPU intensive.

I've no idea where you were going with that 2cnd query there. Full scanning but stopping at row 562 (not completing the scan) versus full scanning an entirely differently table -- why do you think they would be even remotely the same.


I cannot imagine the usefulness of "showing 4 random records" on a home page. Perhaps if you explain your actual business requirements and constraints -- we could come up with something reasonable.

I don't see what storing 4 values of random numbers and using a "union" would do for you -- I cannot imagine.

October 21, 2002 - 11:19 pm UTC

Reviewer: Bhagat Singh from Delhi,India

Tom is there a way around for oracle version 8.1.6 for this please update us on the same


Bhagat Singh

Tom Kyte

Followup  

October 22, 2002 - 7:09 am UTC

if dbms_random.value isn't working well for you -- see

</code> http://asktom.oracle.com/Misc/Random.html <code>

there I have a random number generator in PLSQL (works identically to the C rand() function)

same technique works.

Works on 8i

October 22, 2002 - 4:08 am UTC

Reviewer: Dave

That also works on 8i, try it

Unique Random Values

October 22, 2002 - 4:10 am UTC

Reviewer: A reader

John, ur gonna have to write some code to generate the 4 unique random numbers that are < count(*) of ur table. Why take the pain... Follow Tom...

For original poster

October 22, 2002 - 4:37 am UTC

Reviewer: Connor McDonald from UK

You might have some joy with the SAMPLE clause to give a smaller/quicker/"semi-random" set and then apply the dbms_random to that.

hth
connor

Tom Kyte

Followup  

October 22, 2002 - 7:24 am UTC

good point -- if there are no joins, just a simple single table query, using sample to get a couple of dozen random observations and then ordering that by random would be speedier.


The trick would be specifying the right "pct" or "block" value in order to get a sample *big* enough.  For example in the following, sample(1) gets 1% of 5000 or about 50.  It might get 0, it might get 100 (very very doubtful BUT the chance exists, it is a "sample" not a precise "get me 50 rows").  So,  you need to make sure the sample size is large enough to get at least 4 rows!

ops$tkyte@ORA817DEV.US.ORACLE.COM> /*
DOC>
DOC>drop table t;
DOC>
DOC>create table t as select * from all_objects where rownum <= 5000;
DOC>*/
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> set timing on
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> select object_name, object_id
  2    from ( select object_name, object_id
  3                     from t
  4                    where object_name like '%%'
  5                    order by dbms_random.random )
  6   where rownum <= 4;

OBJECT_NAME                     OBJECT_ID
------------------------------ ----------
/65ddadec__TimeoutImplBase          31405
/a2efb14b_Manager                   30951
/782c5acd_XSDFactory                44004
/b7b4a7fc_BorderUIResourceBeve       7980

Elapsed: 00:00:00.57
ops$tkyte@ORA817DEV.US.ORACLE.COM> /

OBJECT_NAME                     OBJECT_ID
------------------------------ ----------
/8d664d2f_ObjectBuilder             43951
/ae48c769_DynaHashTest2             14703
/860c1d1e_HandlerRegistryClien      30343
/6fbd66b5_BeanContextServicesS       8840

Elapsed: 00:00:00.56
ops$tkyte@ORA817DEV.US.ORACLE.COM> select object_name, object_id
  2    from ( select object_name, object_id
  3                     from t sample ( 1 )
  4                    where object_name like '%%'
  5                    order by dbms_random.random )
  6   where rownum <= 4;

OBJECT_NAME                     OBJECT_ID
------------------------------ ----------
/4ff59575_MetalComboBoxUIMetal       7336
/65fabb7f_ServerInterceptorFac      30355
/7ad01403_schemasHolder             31442
/68f6d056_HTMLDocumentHTMLRead       9498

Elapsed: 00:00:00.01
ops$tkyte@ORA817DEV.US.ORACLE.COM> /

OBJECT_NAME                     OBJECT_ID
------------------------------ ----------
/7e51c6b4_JspBeanFieldInfo          35358
/858cce89_WrappedPlainViewWrap       9450
/19c0b889_NotEmptyHelper            32240
/a995fc24_DropTargetContextPee      10144

Elapsed: 00:00:00.01
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> set timing off



Using sample made this much faster (but the poster must be on a teeny tiny machine if it took them 8 seconds to do 5000 rows) 

October 22, 2002 - 5:26 am UTC

Reviewer: A reader

Dear Tom,
Can you please explain the working of dbms_random.value in the order by clause?

Tom Kyte

Followup  

October 22, 2002 - 7:26 am UTC

umm, well, we order the results by a random number and pick off the first 4 -- getting in effect 4 random rows from the entire result set.

why randomly selecting numbers? -- answer from john

October 22, 2002 - 11:58 am UTC

Reviewer: John

tom,

i am sorry for the confusion.

in the 2nd query i put sid_em_ref is same as people table.

the intention of selecting the 4 random ids from people table is:

when i login to our site, we have to display 4 employee photos who belong my division and who said ok for their photos being viewed by others.

i ran the query you gave in another followup with sample(1) clause. it runs reasonably fast.

but can we use the query with sample caluse for my requirement with no hassles? and i don't know anthing about this sample clause

again sorry for any confusion caused.

Tom Kyte

Followup  

October 22, 2002 - 12:54 pm UTC

if your predicate always returns more then 1000 rows or so, a 1% sample via

sample(1)

should return about 10 rows give or take. Getting 4 from that should be "safe" enough -- especially since it is something as inoccent as "display 4 random pictures". If you get less then 4 -- big deal (and if someone logs a bug on it just say "hit reload and tell me if it happens again -- and then it won't ;)

sample cluase

October 22, 2002 - 12:08 pm UTC

Reviewer: john

sample clause doesn't work for joins.
any work around for that??

Tom Kyte

Followup  

October 22, 2002 - 1:00 pm UTC

join to the sample. Since:

scott@ORA920.US.ORACLE.COM> select *
2 from ( select ename, dname
3 from emp sample(10), dept
4 where emp.deptno = dept.deptno
5 order by dbms_random.value )
6 where rownum <= 4;
where emp.deptno = dept.deptno
*
ERROR at line 4:
ORA-30561: SAMPLE option not allowed in statement with multiple table references

doesn't fly, use the equivalent:

scott@ORA920.US.ORACLE.COM>
scott@ORA920.US.ORACLE.COM>
scott@ORA920.US.ORACLE.COM> select ename, dname
2 from ( select *
3 from ( select ename, deptno
4 from emp sample(50)
5 order by dbms_random.value )
6 where rownum <= 4 ) EMP, dept
7 where emp.deptno = dept.deptno
8 /

ENAME DNAME
---------- --------------
MILLER ACCOUNTING
WARD SALES
KING ACCOUNTING

scott@ORA920.US.ORACLE.COM> /

ENAME DNAME
---------- --------------
ADAMS RESEARCH
TURNER SALES
SCOTT RESEARCH
ALLEN SALES

scott@ORA920.US.ORACLE.COM>

Still not clear...

October 22, 2002 - 12:20 pm UTC

Reviewer: Kashif

Hi Tom,

I'm still a little hazy about how dbms_random works in the order by clause. My understanding was that a number in the order by clause signified the number of the column in the select clause, so 'select * from t order by 6' would order by the 6th column in t. How would the order by work for a number such as 633248193 (last number generated by dbms_random in our database) and so forth?

Secondly, isn't it true that the dbms_random package needs to be initialized with a seed number? I'm assuming you've already initialized it when you call the random function in the order by clause.

Thanks in advance.

Kashif

Tom Kyte

Followup  

October 22, 2002 - 1:02 pm UTC

select * from emp order by sal;


so, when sal = 1300 does that order by the 1300'th column??? (no)

select * from emp order by dbms_random.random;

is just like ordering by SAL.


dbms_random hasn't needed to be initialized for a couple of releases now.

sample caluse

October 22, 2002 - 2:32 pm UTC

Reviewer: John

tom,
thanks for the alternatives given.

will the sample(10) caluse return
1.10 random rows and get the random values or
2.it gives first 10 rows.

if 2 is the case, i get 4 random photos of first 10 employees of that department all the time, which we can't tell random.


and for using sample caluse on joins,
i saw your workaround, but as i mentioned, i am not joining to get the data, i have to join to filter out some rows and get 4 random rows from that filtered rows.

your query is :

select ename, dname
2 from ( select *
3 from ( select ename, deptno
4 from emp sample(50)
5 order by dbms_random.value )
6 where rownum <= 4 ) EMP, dept
7 where emp.deptno = dept.deptno

i want to give something like this:
scott@ORA920.US.ORACLE.COM> select *
2 from ( select ename, dname
3 from emp sample(10), dept
4 where emp.deptno = dept.deptno
and dept.deptno = '123'
5 order by dbms_random.value )
6 where rownum <= 4
where emp.deptno = dept.deptno

thanks

Tom Kyte

Followup  

October 22, 2002 - 3:51 pm UTC

sample(10) is 10%

See the SQL Reference for information on SQL statements!

</code> http://docs.oracle.com/docs/cd/A87860_01/doc/server.817/a85397/state21b.htm#2065954 <code>

sample(10) will return about 10% of the rows randomly.


If what you say is true -- that you join to filter -- then you violate what i said in the first place:

....
good point -- if there are no joins, just a simple single table query, using
sample to get a couple of dozen random observations and then ordering that by
random would be speedier.
...........



You know what -- if you want my opinion -- this seems like a lot of work for a "silly" function. To show four grinning heads "at random". Not sure its what I want my database to spend half of its time doing if you know what I mean...


If you didn't want my opinion, well, guess it is too late for that!


clarification

October 22, 2002 - 3:59 pm UTC

Reviewer: Mark Wooldridge from Vienna, VA.

Tom,

Interesting solution. I am responding to a reply by Kashif of why the dbms_random works.

Doesn't the order by dbms_random do the same as:
select *
from ( select empno, ename, dbms_random.value
from emp
where ename like '%'
order by 3 )
where rownum <= 4

It is confusing that you can specify a number as the order by column or a column name, or function, if the order by is a function, it is applied to each record returned and then ordered by that value. Similar to order by substr(ename, 3).



Tom Kyte

Followup  

October 22, 2002 - 4:22 pm UTC

correct, well said.

sample clause -- joins

October 22, 2002 - 4:23 pm UTC

Reviewer: A reader

to your statement

"If what you say is true -- that you join to filter -- then you violate what i
said in the first place:"

the point here is not whether John is true or you are true.
point here is to find good solution for a requirement(in your opinion which is silly)

well, people have their own requirements which you can never tell silly. because you are being paid to solve their requirements.

Tom Kyte

Followup  

October 22, 2002 - 7:01 pm UTC

I clearly said waaaayyy back when the sample idea was first proposed:

"if you join, no joy for you"

Sorry to have wasted time (his and mine) pursuing something that was not to be (faster via sample)... I thought I was laying it all out from the get go.

I'm not being paid to solve their requirments by the way -- I actually sort of take exception to that.

October 22, 2002 - 4:29 pm UTC

Reviewer: John

i said only:
"
i want to give something like this:
scott@ORA920.US.ORACLE.COM> select *
2 from ( select ename, dname
3 from emp sample(10), dept
4 where emp.deptno = dept.deptno
and dept.deptno = '123'
5 order by dbms_random.value )
6 where rownum <= 4
where emp.deptno = dept.deptno
"

i never said i am true, i only said that i want something like that which i don't know how to do that.

please leave it as it is,
my mistake to ask a silly question.

btw, thanks for answering the sample clause -- i don't know that.


sample clause -- join -- correction

October 22, 2002 - 4:51 pm UTC

Reviewer: A reader

correction to :

"because
you are being paid to solve their requirements. "

"we are being paid to answer business owner's requirements."


Tom Kyte

Followup  

October 22, 2002 - 7:12 pm UTC

And - many times -- if you speak up and say

"well, you know that little thing about displaying 4 random pictures. Well, its cute and all but let me explain to you the resources it would take to accomplish. Now, are you sure you still want it?"


That -- that is our job as well. Perhaps even more important then doing whatever you are told. What sounds trivial to them "4 random pictures based on this complex query" when translated into "it'll take a bigger machine to host" -- might persuade them that its perhaps not a requirement.


It is like disaster recovery -- everyone wants it, but they don't want to pay for it. Except DR isn't something you should skip-- but maybe -- just maybe -- this is.

speak up and say

October 22, 2002 - 7:33 pm UTC

Reviewer: A reader

you mean "speak up and say" to the business owner that your requirement is silly?

that too without knowing why do they really want it.

it may be simple to you to tell us like that, as after all we are asking you the questions, but we can't do that to our clients.

Tom Kyte

Followup  

October 22, 2002 - 9:14 pm UTC

then you are doing your clients a huge, great, large dis-service.

You are not helping them, you are hurting them.

Do you read dilbert (a comic strip)?

The pointy haired guy is the business owner when it comes to technlogy. Rotating 3D heads might look really cool and they'll call it a business requirement but unless you tell them the COST of the 3D rotating head on their display -- they'll just want it.

Tell them the cost of what they are asking for and you might really be surprised at how quickly they become accomidating. Really.

I've seen 99.9999% uptime (business requirement) get rapidly reduced, once they find out the cost and offset that against the cost of being down for an hour. This is NOT ANY DIFFERENT.

randomly generating unique integers in 9iR2

November 11, 2002 - 6:55 pm UTC

Reviewer: John

tom,

is there way to generate unique random integers.

like below

declare
v_random1 integer;
v_random2 integer;
v_random3 integer;
v_random4 integer;
begin
v_random1 := dbms_random.value(1,75);
v_random2 := dbms_random.value(1,75);
v_random3 := dbms_random.value(1,75);
v_random4 := dbms_random.value(1,75);
end;

sometimes i get duplicates, how can i avoid duplicates

Tom Kyte

Followup  

November 11, 2002 - 7:16 pm UTC

Well, "unique random values" is an oxymoron....

You cannot really do it -- maybe something like this tho:


select to_char( dbms_random.value(1,75), '99' )) ||
to_char(seq.nextval,'fm00000000009' ) into v_r
from dual;

use a sequence to make it unique

Alternative

November 12, 2002 - 8:19 am UTC

Reviewer: Paul from UK

Tom,

As a mathematician, I totally agree with you!

If the user means "Can I randomly order the integers 1 .. 75 and select the first 4?", then something like this would do (copying your trick of ordering by DBMS_RANDOM.RANDOM):

SELECT * FROM
(SELECT rownum my_integer
FROM all_objects
WHERE rownum < 76
ORDER BY dbms_random.random)
WHERE ROWNUM < 5
/




Tom Kyte

Followup  

November 12, 2002 - 10:32 am UTC

excellent -- that would do the trick of getting 4 unique random numbers in the range of 1..75 yes.

Using sample with a where clause

November 12, 2002 - 8:47 pm UTC

Reviewer: Sam To from San Mateo, CA

Tom,

I am writing an application that involves randomly selecting a small sample of items in a certain category from a large shopping catalog and then feeding them to an AI engine to automatically generate rules on why these items are classified into that category. All the items are in a single table.

If I use the SAMPLE clause, something like:
SELECT rowid, item_desc FROM my_catalog_items sample(10)
WHERE category_name = :b1
then for categories that have a small number of items this statement often return no rows. I think the sample is taken before applying the where clause, am I correct?

Is there a better way to do this? I want to avoid scanning the entire table since it could be very large.

Tom Kyte

Followup  

November 13, 2002 - 12:55 pm UTC

with a small number of rows -- a 10% sample could EASILY return 0 rows.
1* select dname from dept sample(10)
scott@ORA920.LOCALHOST> /

DNAME
--------------
ACCOUNTING
SALES


scott@ORA920.LOCALHOST> /

no rows selected

with or without a where clause. If the sample IS TOO SMALL -- you stand a very very good chance of randomly getting nothing.


sample does scan the full table the way you are using it - sample BLOCK wouldn't (but the same problem would arise -- maybe even worse!)

Sorry -- but I don't have a "good" solution for you short of "rerun the query if the sample size is deemed too small"

Randomly selecting with 7.3

December 11, 2002 - 1:12 am UTC

Reviewer: Brijesh

Hi Tom,

I have created your random package with 7.3. (we are in the process of upgrading to 902, but it 

will take another 6 months as we have a huge number of users).

We have a requirement :

    We call companies to offer tenders.
    The companies information is stored in a table called "fatwa" which has over a million 

rows.

    In order to give fair chance to companies and to involve no user intervention in 

selecting companies we decided to randomly select a given set of companies and call them to offer 

tender.

    After selecting the companies I am entering the company id of the selected companies in a 

table called "old_fatwa" so that later I can exclude them.
    And when ever I select new companies I query the "fatwa" table like this :

First a count(*) from the table.

SQL> select count(*) from fatwa;

  COUNT(*)
----------
   1179651

Elapsed: 00:00:02.05

And now selecting companies randomly.

SQL> ed
Wrote file afiedt.buf

  1  select fatwaid from
  2     (select * from fatwa where fatwaid not in
  3             (select fatwaid from old_fatwa)
  4     order by random.rand)
  5* where rownum <=5
SQL> /

   FATWAID
----------
     10407
     22342
     50031
     71660
     54781

Elapsed: 00:02:51.03
SQL> /

   FATWAID
----------
     42989
     66700
    215773
    249553
    228372

Elapsed: 00:02:55.00

I am able to get the companies randomly but it is taking too long.....
Can you please help me reduce that.

Thanks Brijesh.
 

Tom Kyte

Followup  

December 11, 2002 - 6:40 am UTC

for a million rows, surprised it is going that fast!

Especially in 73. 1,000,000 calls to random.rand is taking a good portion of that -- just do a loop and time it.

In 7.3 -- not using CBO, the not in is probably an issue as well. did you test how long just:

1 select fatwaid from
2 (select * from fatwa where fatwaid not in
3 (select fatwaid from old_fatwa)
4 )
5*

takes? try:

select a.*
from fatwa a, old_fatwa b
where a.fatwaid = b.fatwaid(+)
and b.fatwaid is null;


getting a random sample from 1million rows is going to take a while -- in 8i and up with dbms_random and the top-n query optimization it will take LESS time, but it'll still take a while.

sorry -- no good answer for you on this one.

Same values using dbms_random.value

May 02, 2003 - 5:22 pm UTC

Reviewer: Max

Tom,

What are the chances that dbms_random.value returns same values in a multiuser environment ? For one of my process, I insert some rows into a table (this is an existing table which is no longer used -- so I use it as a temp table). But since the table has a pk constraint on the column, I simply generate a random value, insert data and delete data when I am done. So technically, at any given time, the table is always empty.

So would it any time happen that 2 users running the same process from different sessions might get the same random value ? What are the chances ?

I know, I can use sequences and get rid of the matter but I just do not want to create an additional object. But if there is a high chance, I will use sequences.

Thanks in advance !

Tom Kyte

Followup  

May 02, 2003 - 7:49 pm UTC

100% at some point.

you want to use a global temporary table -- each session can see only ITS data.

I've never understood the "i don't want to create that which is necessary for my application" thought process. seems counter productive to me.

random numbers !=non repeating numbers

May 03, 2003 - 12:51 pm UTC

Reviewer: Jim from Portland, OR USA

Just because you are generating a random number does not mean you won't get repeats (the same number). The random number generator in Oracle is generates numbers via an algorithium that allows replacement. To generate a list of random numbers with out the possibility of a repeat you would need a random number generator that generates numbers without replacement.

Think of it this way. You have a large bucket of marbles. Each marble has a different number on it. The Oracle Random number generator (and most random number generators) work by picking a marble and telling you the number on it and then replacing the marble back in the bucket. (with replacement)

To pick numbers without replacement you have to pick a marbel and not put it back into the bucket.

The bucket analogy is just an analogy. The random number generator is just a mathmatical formula - there are a lot of them out of there.

So don't use random numbers to generate a primary key. I worked with some developers who tried this and couldn't understand why it failed every so often. (They were getting repeated keys)

creating a view out of sample (8.1.7.4)

May 05, 2003 - 8:13 pm UTC

Reviewer: Scott from Perth, Western Australia

G'day

The sample feature is perfect for our problem. We also want to get random employee records - but just to create random names as dummy data for our customer records in development - due to privacy laws.
I tried to create a view, so I could also use the lag and lead functions (as they don't work in pl/sql), but I receive this error:
FROM employees SAMPLE (1)
*
ERROR at line 6:
ORA-00907: missing right parenthesis

While the query by itself in SQL*Plus works...
SELECT first_name, last_name, addr_1, NULL locality
FROM (SELECT first_name
,NVL(LEAD(last_name) OVER (ORDER BY DBMS_RANDOM.value(1,1000)),LAG(last_name) OVER (ORDER BY DBMS_RANDOM.value(1,1000))) last_name
,NVL(LEAD(addr_1,2) OVER (ORDER BY DBMS_RANDOM.value(1,1000)),LAG(addr_1,1) OVER (ORDER BY DBMS_RANDOM.value(1,1000))) addr_1
FROM employees SAMPLE (1)
ORDER BY DBMS_RANDOM.value(1,1000))
/
FIRST_NAME LAST_NAME ADDR_1
------------------------------ ------------------------------ ----------------------
EVELYN CRAWFORD 1 BATHGATE LOOP

Why can't sample () be used in a view?

Thanks.

Tom Kyte

Followup  

May 05, 2003 - 8:39 pm UTC

Sample is not support in views.

dynamic order by clause

April 20, 2004 - 7:23 pm UTC

Reviewer: A reader

hi Tom
why dont I get the correct order in the second
call to the procedure p. Should not the order by
clause be "order by 2, 1" in the second call?

thanx!
-----------
scott@ORA10G> set echo on
scott@ORA10G> create or replace procedure p( a number, b number )
2 is
3 begin
4 dbms_output.put_line( 'order by '||a ||','|| b);
5 for i in ( select empno, ename
6 from emp order by a, b
7 )
8 loop
9 dbms_output.put_line( i.empno || ',' || i.ename );
10 end loop;
11 end;
12 /

Procedure created.

scott@ORA10G> begin
2 p( 1, 2 );
3 p( 2, 1);
4 end;
5 /
order by 1,2
7369,SMITH
7499,ALLEN
7521,WARD
7566,JONES
7654,MARTIN
7698,BLAKE
7782,CLARK
7788,SCOTT
7839,KING
7844,TURNER
7876,ADAMS
7900,JAMES
7902,FORD
7934,MILLER
order by 2,1 <========== following should order by ename
first?
7369,SMITH
7499,ALLEN
7521,WARD
7566,JONES
7654,MARTIN
7698,BLAKE
7782,CLARK
7788,SCOTT
7839,KING
7844,TURNER
7876,ADAMS
7900,JAMES
7902,FORD
7934,MILLER

PL/SQL procedure successfully completed.
----------

Tom Kyte

Followup  

April 21, 2004 - 12:12 pm UTC

you are ordering by the number 2 and the number 1.

You are NOT ordering by column 2 and then column 1 or column 1 and then column 2.

You are doing something like:

order by 'a', 'b'

order by 'b', 'a'

neither order by really makes sense, but we do it for you.

read
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:15503866429218 <code>

thanx Tom!

April 21, 2004 - 12:17 pm UTC

Reviewer: A reader

I figured as much. In fact it looks like it just
ignores the order by clause if you give one like I did.
This means it gets rows in the default order - which happened to coincide with my first order by clause:)
Interesting solutions at the link you provided - Thanx
a lot!!

Query is not working

April 21, 2004 - 1:28 pm UTC

Reviewer: Thirumal from india

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle8i Release 8.1.5.0.0 - Production
PL/SQL Release 8.1.5.0.0 - Production
CORE Version 8.1.5.0.0 - Production
TNS for 32-bit Windows: Version 8.1.5.0.0 - Production
NLSRTL Version 3.4.0.0.0 - Production

The given query is not working

SQL> select *
  2        from ( select empno, ename
  3                 from emp
  4                        where ename like '%'
  5                        order by dbms_random.value )
  6       where rownum <= 4
  7  /
                      order by dbms_random.value )
                                           *
ERROR at line 5:
ORA-00904: invalid column name

Please help me.
Thirumal
 

Tom Kyte

Followup  

April 21, 2004 - 9:02 pm UTC

ops$tkyte@ORA815> desc dbms_random
PROCEDURE INITIALIZE
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 SEED                           BINARY_INTEGER          IN
FUNCTION RANDOM RETURNS BINARY_INTEGER
PROCEDURE SEED
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 SEED                           BINARY_INTEGER          IN
PROCEDURE TERMINATE
 



no function value in that release. 

Dbms_random was not supported in 8.1.5 i think

April 21, 2004 - 2:30 pm UTC

Reviewer: Riaz from PRAL, Lahore, Pakistan


How to select random employee(s) from each department?

June 23, 2004 - 11:25 am UTC

Reviewer: A reader

Hello Tom,
Is there a way to select random employee(s) from each department in a single SQL query.

I am currently using a cursor in the PL/SQL code with your query (your very first follow up) to achive this.

Thanks,
Raja

Tom Kyte

Followup  

June 23, 2004 - 11:44 am UTC

scott@ORA9IR2> select *
2 from ( select ename, empno,
3 row_number() over (partition by deptno order by dbms_random.random) rn,
4 count(*) over (partition by deptno) cnt
5 from emp
6 )
7 where rn = 1
8 /

ENAME EMPNO RN CNT
---------- ---------- ---------- ----------
KING 7839 1 3
FORD 7902 1 5
ALLEN 7499 1 6

scott@ORA9IR2> /

ENAME EMPNO RN CNT
---------- ---------- ---------- ----------
CLARK 7782 1 3
FORD 7902 1 5
JAMES 7900 1 6

scott@ORA9IR2> /

ENAME EMPNO RN CNT
---------- ---------- ---------- ----------
CLARK 7782 1 3
ADAMS 7876 1 5
BLAKE 7698 1 6

scott@ORA9IR2>


That was perfect and superfast!!

June 23, 2004 - 12:08 pm UTC

Reviewer: Raja

Appreciate your superfast response!

Is this correct behaviour?

August 18, 2004 - 10:51 am UTC

Reviewer: Martin from UK

Hi Tom,

can you explain if this is documented behaviour or not (we're at 9.2.0.4) :

We have 19502 distinct rows in sites :

SQL> SELECT COUNT(*) FROM sites;

  COUNT(*)
----------
     19502

1 row selected.

These two queries just prove that the query returns a random row each time :

SQL> SELECT sit_id
  2  FROM ( SELECT sit_id, ROW_NUMBER() OVER (ORDER BY dbms_random.value) rn FROM sites )
  3  WHERE rn = 1
  4  /

    SIT_ID
----------
  73458946

1 row selected.

SQL> /

    SIT_ID
----------
  76248570

1 row selected.

However, put the query as an inline select and we always get the same row for every row in the outer query :
SQL> SELECT
  2    x_y,
  3    ( SELECT sit_id
  4        FROM ( SELECT sit_id, ROW_NUMBER() OVER (ORDER BY dbms_random.value) rn FROM sites )
  5       WHERE rn = 1 ) site_id
  6  FROM t2
  7  /

X    SITE_ID
- ----------
x   76362038
x   76362038

2 rows selected.

SQL> /

X    SITE_ID
- ----------
x   76392370
x   76392370

2 rows selected.

Thanks in advance

 

Tom Kyte

Followup  

August 18, 2004 - 11:03 am UTC

the scalar subquery is not executed each time -- in this case -- since there are no varying inputs to it -- it'll be executed only ONCE.

scalar subqueries are neat that way, the silently cache their results and we reuse them.

to alter this, send a bind down

"from sites where t2.rowid is not null"

for example -- t2.rowid will change for each row.


scott@ORA9IR2> select empno
2 from (select empno, row_number() over (order by dbms_random.value) rn
3 from emp)
4 where rn = 1
5 /

EMPNO
----------
7844

scott@ORA9IR2> /

EMPNO
----------
7499

scott@ORA9IR2>
scott@ORA9IR2>
scott@ORA9IR2> select deptno, (
2 select empno
3 from (select empno, row_number() over (order by dbms_random.value) rn
4 from emp)
5 where rn = 1 ) random_emp
6 from dept
7 /

DEPTNO RANDOM_EMP
---------- ----------
10 7369
20 7369
30 7369
40 7369

scott@ORA9IR2> select deptno, (
2 select empno
3 from (select empno, row_number() over (order by dbms_random.value) rn
4 from emp)
5 where rn = 1 and dept.rowid is not null) random_emp
6 from dept
7 /

DEPTNO RANDOM_EMP
---------- ----------
10 7521
20 7654
30 7499
40 7902



August 18, 2004 - 12:24 pm UTC

Reviewer: A reader


Clarification : Randomly selecting with 7.3 December 11, 2002

August 18, 2004 - 8:04 pm UTC

Reviewer: VKOUL from Lacey, WA USA

It is kind of very late, as this relates to Dec 2002

Excerpt :

"In 7.3 -- not using CBO, the not in is probably an issue as well. did you test
how long just:

1 select fatwaid from
2 (select * from fatwa where fatwaid not in
3 (select fatwaid from old_fatwa)
4 )
5*

takes? try:

select a.*
from fatwa a, old_fatwa b
where a.fatwaid = b.fatwaid(+)
and b.fatwaid is null;
"

If we do not need other columns, then the query

1 select fatwaid from
2 (select * from fatwa where fatwaid not in
3 (select fatwaid from old_fatwa)
4 )
5*

can be written more efficiently

SELECT fatwaid
FROM fatwa
MINUS
SELECT fatwaid
FROM old_fatwa


October 14, 2004 - 12:57 am UTC

Reviewer: Prasad Gunaratne from Australia


Problem with select clause

November 03, 2004 - 11:49 am UTC

Reviewer: A reader

Hi Tom,

I have a procedure which constructs a query dynamically and executes it. The query which is constructed is like this

SELECT a.col1, a.col2, a.col3, a.col4, b.col5, A.col6, a.col7
FROM A, B WHERE A.ID = B.ID (+)
AND A.ID1 IN (SELECT distinct c.ID1 FROM c
WHERE c.TYPE = '13' )
AND A.TIME = ( SELECT MAX(C.TIME) FROM C
WHERE C.ID1 = A.ID1)

The problem I am having is this query returns 'no rows selected'. but if I remove one col from the select clause and change it like

SELECT a.col1, a.col2, a.col3, a.col4, b.col5, A.col6

and omit col7 then I am getting the desired results. why does this happen? is there any restriction on the select clause? Pls help.

Thanks.


Tom Kyte

Followup  

November 04, 2004 - 1:38 am UTC

if what you say is accurate that would be "a bug".

(you never need distinct in a subquery).


do you have a simple test case to reproduce with? but if what you say is true, you need to contact support.

Another suggestion and a bug, maybe?

November 04, 2004 - 3:06 pm UTC

Reviewer: Bob B from Albany, NY

I was playing around with another solution that would use analytics to get the row count and then get one random number for each employee that needs to be selected (first row only). It would then mod each random number by the row count and grab the employee with that as their rownum.

It worked for 1 employee, but when I added to the query to get a second employee, I think I may have found a bug. When I run this in 9ir2 r_val1 and r_val2 are equal, even though the query shows that val1 and val2 are different in the first query (well, different most of the time, random could return the same number twice)

SELECT
OBJECT_ID,
rn,
val1,
val2,
FIRST_VALUE( val1 ) OVER ( ORDER BY rn ) r_val1,
FIRST_VALUE( val2 ) OVER ( ORDER BY rn ) r_val2
FROM (
SELECT
ao.OBJECT_ID,
ROWNUM rn,
CASE
WHEN ROWNUM = 1
THEN DBMS_RANDOM.RANDOM
END val1,
CASE
WHEN ROWNUM = 1
THEN DBMS_RANDOM.RANDOM
END val2,
COUNT(1) OVER () cnt
FROM ALL_OBJECTS ao
WHERE ROWNUM <= 2
)

Before I send that in as a bug, I was wondering if you got the same results on your end and if so, if that is what is expected.

Tom Kyte

Followup  

November 05, 2004 - 2:52 pm UTC

yes, looks strange.


ops$tkyte@ORA9IR2> SELECT
  2    OBJECT_ID,
  3    rn,
  4    val1,
  5    val2,
  6    FIRST_VALUE( val1 ) OVER ( ORDER BY rn ) r_val1,
  7    FIRST_VALUE( val2 ) OVER ( ORDER BY rn ) r_val2
  8  FROM (
  9  SELECT
 10    ao.user_id OBJECT_ID,
 11    ROWNUM rn,
 12    CASE
 13      WHEN ROWNUM = 1
 14      THEN DBMS_RANDOM.RANDOM
 15    END val1,
 16    CASE
 17      WHEN ROWNUM = 1
 18      THEN DBMS_RANDOM.RANDOM
 19    END val2,
 20    COUNT(1) OVER () cnt
 21  FROM ALL_users ao
 22  WHERE ROWNUM <= 2
 23  )
 24  /
 
 OBJECT_ID         RN       VAL1       VAL2     R_VAL1     R_VAL2
---------- ---------- ---------- ---------- ---------- ----------
         0          1   94386301 1119652727   94386301   94386301
         5          2                         94386301   94386301
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> SELECT
  2    OBJECT_ID,
  3    rn,
  4    val1,
  5    val2,
  6    FIRST_VALUE( val1 ) OVER ( ORDER BY rn ) r_val1,
  7    FIRST_VALUE( val2 ) OVER ( ORDER BY rn ) r_val2
  8  FROM (
  9  SELECT
 10    ao.user_id OBJECT_ID,
 11    ROWNUM rn,
 12    CASE
 13      WHEN ROWNUM = 1
 14      THEN DBMS_RANDOM.RANDOM
 15    END val1,
 16    CASE
 17      WHEN ROWNUM = 1
 18      THEN DBMS_RANDOM.RANDOM+0
 19    END val2,
 20    COUNT(1) OVER () cnt
 21  FROM ALL_users ao
 22  WHERE ROWNUM <= 2
 23  )
 24  /
 
 OBJECT_ID         RN       VAL1       VAL2     R_VAL1     R_VAL2
---------- ---------- ---------- ---------- ---------- ----------
         0          1 1066352411 1731447925 1066352411 1731447925
         5          2                       1066352411 1731447925
 
 

Overzealous optimizer

November 05, 2004 - 5:31 pm UTC

Reviewer: Bob B from Albany, NY

Ah, I guess the optimizer was a little overzealous in this case.

sample result does not match percentage

November 08, 2004 - 7:16 am UTC

Reviewer: A reader

hi tom,


the sample percentage clause doesn't match the given percentage exactly but only in an "up to" pecentage manner. why?

is there a way to return a given percentage of rows EXACTLY using the sample clause syntax?


SQL> create table test as select rownum r from all_tables where rownum <= 100;

Tabelle wurde angelegt.

SQL> select count (*) from test;

  COUNT(*)
----------
       100

SQL> select * from test sample (5);

         R
----------
         4
        96
        97

SQL> select * from test sample (5);

         R
----------
        17
        51
        58
        74

SQL> select * from test sample (5);

         R
----------
        14
        32
        94
        97
 

Tom Kyte

Followup  

November 08, 2004 - 4:42 pm UTC

no, it is a random sample of "about X%" -- random sample.

it could return all records at some point
or zero

it "conceptually" takes each row, generates a random number between say 0 and 1, if number <= 0.05 -- output it, else no.


The larger the base table -- the closer in general the number of rows will be to an exact percentage.


(but if you think about random sampling, it could not be random if it were exact! you'd always get "more from the end" for example -- as we try to make it "exact")

Repeat random selection

November 21, 2004 - 2:04 am UTC

Reviewer: PK from India

Why the followig query is giving same value for all 10 rows returned?

select (select deptno from
(select deptno from dept
order by dbms_random.random)
where rownum <2) as V1
from all_objects where rownum<11
/

Is there any way I can repeat a random selection and at the same time generate 'n' number of rows in a single query?

Thanks.

Tom Kyte

Followup  

November 21, 2004 - 9:12 am UTC

scalar subquery caching.

if the inputs to the scalar subquery do not change, it is not re-executed, no need to. (don't ascribe procedural concepts to SQL, you'll get burnt -- analytics maybe, but not sql in general)


scott@ORA9IR2> select (select deptno from
2 (select deptno from dept
3 order by dbms_random.random)
4 where rownum <2 and nvl(all_objects.object_id,0) is not null) as V1
5 from all_objects where rownum<11
6 /

V1
----------
30
30
40
30
10
20
30
20
20
20

10 rows selected.


by correlating the scalar subquery (passing in something) and passing in something that is unique for each row -- we can make the "caching" go away



Calculating percentage of Count(*) records

November 08, 2005 - 12:12 pm UTC

Reviewer: denni50 from na

Hi Tom!!
it's been a while..hope all is well with you!

I have a table filled with the following appealcodes(and total counts)..I need to build another table taking 25% of each appealcode and inserting those records into another table...so as an example:
appealcode DCH05MA005 would contain 1895 records from the original count of 7579 and have those randon records inserted into the new table...same with the rest of appealcodes.

thanks for any help/tips...been on a life changing journey these past months.

btw this needs to be done using sql in MSAccess..just need sql logic can tweak to work in Access...thanks.

appealcode Count 0.25
DCH05MA005 7579 1895
DCH05MA006 273 68
DCH05MA007 245 61
DCH05MA008 43 11
DCH05MA009 7148 1787
DCH05MA010 343 86
DCH05MA011 278 70
DCH05MA030 2086 522
DCH05MA031 152 38
DCH05MA032 93 23
DCH05MA033 25 6
DCH05MA055 3600 900
DCH05MA056 366 92
DCH05MA057 275 69
DCH05MA058 92 23
DCH05MA154 3857 964
DCH05MA155 821 205
DCH05MA156 1203 301
DCH05MA157 517 129
DCH05MA158 4827 1207


Tom Kyte

Followup  

November 08, 2005 - 10:25 pm UTC

well, you had my attention until you said "access"

are you using access as the front end or as the database?

Moving percentage of records to another table

November 09, 2005 - 8:28 am UTC

Reviewer: denni50 from na

doesn't matter Oracle or Access...sql is sql...just
want to know the logic...you can provide it from the
standpoint of Oracle..I can convert logic to Access Sql.

had to get this done yesterday so I came up with a workaround of flagging every nth(5th record) with a code
then created a new table where column=flagged code..
however I know there's a way to do this with a formula/calculation.

thanks

Tom Kyte

Followup  

November 11, 2005 - 9:59 am UTC

sql is not sql.

analytics are not available in access.

the "sql is sql" thought is dangerous, don't fall into that trap.

thanks for the tip..

November 11, 2005 - 10:32 am UTC

Reviewer: denni50 from na

just for my own knowledge I'd appreciate the Oracle
version using AF to accomplish what I requested in previous
post.

thanks


Tom Kyte

Followup  

November 12, 2005 - 8:39 am UTC

if I had a create table and insert intos - I would :)

no need...

November 12, 2005 - 5:05 pm UTC

Reviewer: denni50 from na

figured it out got it working in both oracle and access.



Random row performance test

January 31, 2006 - 10:39 am UTC

Reviewer: Joseph Bui from Washington, DC USA

Here is an implementation on Oracle 8.1.7 to get one row randomly and quickly from large tables.

SQL> create or replace function get_random_rowid
( table_name varchar2
) return urowid
as
sql_v varchar2(100);
urowid_t dbms_sql.urowid_table;
cursor_v integer;
status_v integer;
rows_v integer;
begin
  for exp_v in -6..2 loop
    exit when (urowid_t.count > 0);
    if (exp_v < 2) then
      sql_v := 'select rowid from ' || table_name
      || ' sample block (' || power(10, exp_v) || ')';
    else
      sql_v := 'select rowid from ' || table_name;
    end if;
    cursor_v := dbms_sql.open_cursor;
    dbms_sql.parse(cursor_v, sql_v, dbms_sql.native);
    dbms_sql.define_array(cursor_v, 1, urowid_t, 100, 0);
    status_v := dbms_sql.execute(cursor_v);
    loop
      rows_v := dbms_sql.fetch_rows(cursor_v);
      dbms_sql.column_value(cursor_v, 1, urowid_t);
      exit when rows_v != 100;
    end loop;
    dbms_sql.close_cursor(cursor_v);
  end loop;

  if (urowid_t.count > 0) then
    return urowid_t(trunc(dbms_random.value(0, urowid_t.count)));
  end if;

  return null;

exception when others then
  if (dbms_sql.is_open(cursor_v)) then
    dbms_sql.close_cursor(cursor_v);
  end if;
  raise;
end;
/
show errors

Function created.

Elapsed: 00:00:00.94
No errors.
SQL> select count(*) from my_table;

  COUNT(*)
----------
  74152137

1 row selected.

Elapsed: 00:29:23.13
SQL> set autotrace on
SQL> select get_random_rowid('my_table') from dual;

GET_RANDOM_ROWID('
------------------
AABLs0AAcAAALHkABm

1 row selected.

Elapsed: 00:00:06.36

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'DUAL'

Statistics
----------------------------------------------------------
        177  recursive calls
       9344  db block gets
        162  consistent gets
        584  physical reads
          0  redo size
        227  bytes sent via SQL*Net to client
        207  bytes received via SQL*Net from client
          6  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> set autotrace traceonly
SQL> select * from my_table
where rowid = get_random_rowid('my_table');

1 row selected.

Elapsed: 00:00:06.15

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=66)
   1    0   TABLE ACCESS (BY USER ROWID) OF 'MY_TABLE' (Cost=1 Card=1 Bytes=66)

Statistics
----------------------------------------------------------
         10  recursive calls
       9332  db block gets
        109  consistent gets
        614  physical reads
          0  redo size
        536  bytes sent via SQL*Net to client
        232  bytes received via SQL*Net from client
          6  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 

Random data sampling

September 28, 2007 - 3:08 pm UTC

Reviewer: Sara from CT,USA

Hi Tom,

I've a transaction table(Invoice) like below:
INVOICE_NO, STATE_CD
1 1
2 2
3 3
4 1
5 4
....
and a reference table(State) like below:

STATE_CD STATE_DESC
1 Alabama
2 California
3 New Jersey
4 Pennsylvania
5 Virginia

If I select 4 random records from the invoice table, query should return invoice_no 1,2,3 and 5. My objective is, select transactional random records(say 10%) that should cover all unique reference data.

Is there any way to implement in SQL or PL/SQL?

would really appreciate your reply
Tom Kyte

Followup  

October 03, 2007 - 12:57 pm UTC

you'd need to randomly sample 10%/number_of_reference_items - number of reference items times, one by one, to get a complete cover (recognizing that of course any of the 10%/number_of_reference_items sample queries is allowed to return ZERO records)

DBMS_UTILITY.GET_HASH_VALUE

August 05, 2008 - 3:36 am UTC

Reviewer: Chetan from Kuwait

Hi Tom,
As regards to the original question why couldnt we just use DBMS_UTILITY.GET_HASH_VALUE instead of generating random values. Is the generation of random numbers faster than using the hash value function? And also are the hash values that are generated more "random" than the inbuilt random number generator?
Tom Kyte

Followup  

August 05, 2008 - 9:17 am UTC

hashing is not random, hashing is rather quite predicable - we use hashes to PREDICTABLY get the same values given the same inputs over and over.

the requirement was "randomly select"

Updating random set of rows

October 16, 2008 - 11:39 am UTC

Reviewer: Reinhard from Valhalla, NY USA

Tom,

I attempted to leverage the code to randomly select a subset of rows from a table and use it to update a status code in that table. I am getting an ORA-30560 which I cannot even find in the 10g Documentation:

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Data Mining and Real Application Testing options

Selecting a random set of vendors works fine:

SQL> select vendor_id from (select vendor_id
2 from sap_cmf_vendor_synch sample ( 1 )
3 where vendor_nm like '%%'
4 order by dbms_random.random )
5 where rownum <= 5;

VENDOR_ID
----------
0040085418
0040126911
0040131900
0040116852
0040083935

SQL> /

VENDOR_ID
----------
0040133509
0040109878
0040084444
0040129614
0040112381

but trying to update based on this list fails:

SQL> update sap_cmf_vendor_synch a
2 set a.status_code='R'
3 where a.vendor_id in (
4 select vendor_id from (select vendor_id
5 from sap_cmf_vendor_synch sample ( 1 )
6 where vendor_nm like '%%'
7 order by dbms_random.random )
8 where rownum <= 5);
from sap_cmf_vendor_synch sample ( 1 )
*
ERROR at line 5:
ORA-30560: SAMPLE clause not allowed

Any idea why? Can you think of a workaround?

Thanks in advance.

Tom Kyte

Followup  

October 17, 2008 - 9:07 pm UTC

umm, because the sample clause is not allowed (seems sort of obvious, it is the text of the error message?) - it is not deterministic, we frown on that in modifications - so much so, we don't let you do it.


what are you trying to accomplish here (for real, in the form of a specification like you would give to a coder to implement).

the sample subquery might return 0 through 5 different vendor ids - this seems like a really bad idea.

If you really want to do it, two step it - step A, get the N vendors, step B, run a deterministic update to update those five.

Updating random set of rows

October 16, 2008 - 12:07 pm UTC

Reviewer: Reinhard from Valhalla, NY USA

Follow-up:

I will partially answer my own question. I removed the SAMPLE clause from my update statement and I got the results I was looking for.

I am still not sure why I got the error nor why the 10g Documentation does not explain the error.

Thank you, Tom, for your tireless quest to encourage people to THINK.

Updating random set of rows

October 22, 2008 - 1:10 pm UTC

Reviewer: Reinhard from Valhalla, NY USA

The requirement was simply to change the status of a subset of rows in a table for testing purposes only. The table has 42,000+ rows in it and I wanted to programatically make 'n < 42000' rows available to the downstream processes for unit testing.

This was not anything that would ever make into a production system.

Thanks again for taking the time to respond.

random select specific number of rows for each employee

May 01, 2013 - 8:01 pm UTC

Reviewer: ChrisNZak from USA

We were given and assignment to select only two records from the list of agents..i.e 2 random records for each agent over the span of a week etc.... and below is what we got and it works

with summary as ( Select Dbms_Random.Random As Ran_Number, colmn1, colm2, colm3 Row_Number() Over(Partition By col2 Order By Dbms_Random.Random) As Rank From table1, table2 Where Table1.Id = Table2.Id Order By Dbms_Random.Random Asc) Select tab1.col2, tab1.col4, tab1.col5, From Summary s Where s.Rank <= 2;

More to Explore

DBMS_RANDOM

More on PL/SQL routine DBMS_RANDOM here