Paul, June 25, 2020 - 3:06 pm UTC
Thanks for the suggestion it was helpful, especially the part about the hint, I was unaware of it.
I would prefer not to reinvent the wheel as I have a working solution, which I posted.
For academic purposes can you please explain why my solution was not modified? Was there a problem with the solution I provided.
I would really like to see where I went wrong trying to implement the random rows per employee. Thanks for your time and expertise. I await your response
June 26, 2020 - 3:21 pm UTC
For academic purposes can you please explain why my solution was not modified? Was there a problem with the solution I provided.
Two reasons:
- Isolating the example to what you're asking for (repeat each employee's row a random number of times beween N-M) makes it easier to show and explain what's going on. This makes the question more valuable to other users of this site. Including this in your example should be relatively easy (no need to "reinvent the wheel").
- I'm not entirely clear where exactly in your code you're expecting this N row generation to go!
Paul, June 26, 2020 - 3:54 pm UTC
Your solution seemed to work fine. I tried creating a cartesian product between employee and locations table and was unable to get that to work so I could not go any further and create a random date.
I thought I was almost there with what I provided in my test case and was curious where it fell short.
June 26, 2020 - 4:55 pm UTC
I need some context about what you're trying to achieve here.
Start from the beginning: what exactly are you hoping to be in each table after inserting the data?
Paul, June 26, 2020 - 5:45 pm UTC
For every employee in this case (50) I want a random number (1-20) rows. For each one of those rows I want to associate a random location_id from the locations table along with a random date MMDDYYYY HH24:MI:SS format from sysdate- sysdate+2.
In your example, I was unable to incorporate the locations record and the random date.
I took another approach using the code below, which appears to work fine but I suspect less efficient as your solution.
In an effort to better myself I was trying to understand the shortcomings of my original solution.
Having said that and the risk of not going too far off a tangent I recently sawcchris Sexton video on joins and was wondering if my new solution is a candidate for hash joins.
SELECT e_id, card_num, l_id,
trunc(sysdate) + dbms_random.value (0, 2) + dbms_random.value (0, .75)
FROM (
SELECT employee_id AS e_id, row_number() OVER (ORDER BY dbms_random.value) AS rn, card_num
FROM employees
) x
JOIN (SELECT location_id AS l_id, row_number() OVER (ORDER BY dbms_random.value) AS rn FROM locations) y USING (rn);
June 29, 2020 - 12:59 pm UTC
You're not getting a random number of rows per employee though - just one.
So what are you looking for?
ONE row per employee linked to a random location?
or
1 - 20 rows for per employee, each of which are joined to a (different?) random location?
or
something else?
I recently saw Chris Sexton video on joins
You mean Chris Saxon? As in me? ;)
my new solution is a candidate for hash joins
The optimizer decides whether to use hash/nested loops/merge join.
Adding more details
Paul, June 29, 2020 - 1:08 pm UTC
It's been worked out. Thanks for your time
INSERT into access_history
(employee_id,
card_num,
location_id,
access_date,
processed)
WITH
emp AS (SELECT e.employee_id, e.card_num, ROWNUM rn FROM employees SAMPLE (99.999) e),
numbers AS (SELECT dbms_random.value(1,25) val, ROWNUM rn
FROM dual CONNECT BY LEVEL <= (SELECT COUNT(*) FROM emp)),
emps AS (SELECT e.employee_id, e.card_num, ROWNUM rn
, (SELECT val FROM numbers n WHERE n.rn = e.rn) nop
FROM emp e,
TABLE(CAST(MULTISET(
SELECT LEVEL lvl FROM dual
CONNECT BY LEVEL <= (SELECT val FROM numbers n WHERE n.rn = e.rn))
AS sys.odciNumberList))),
loc0 AS (SELECT location_id,
ROW_NUMBER() OVER (ORDER BY dbms_random.value) rn,
COUNT(*) OVER() cnt
FROM locations),
loc AS (SELECT TRUNC(dbms_random.value(1,m.cnt)) loc_row, ROWNUM rn
FROM (SELECT 1, cnt FROM loc0 WHERE ROWNUM = 1) m
CONNECT BY LEVEL <= (SELECT COUNT(*) FROM emps))
SELECT e.employee_id, e.card_num, l0.location_id, trunc (sysdate) + dbms_random.value (0, 2.75), 0
FROM emps e, loc l, loc0 l0
WHERE e.rn = l.rn
AND l.loc_row = l0.rn;
June 29, 2020 - 2:36 pm UTC
OK, I'm pretty sure you can simplify that though...