Skip to Main Content

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Charlie.

Asked: June 17, 2002 - 11:15 am UTC

Last updated: August 08, 2022 - 4:33 pm UTC

Version: 8.1.7

Viewed 10K+ times! This question is

You Asked

Hi Tom,

I have a table as follows:

SQL> desc t;
Name Null? Type
----------------------------------------------------- -------- ---------------
NAME VARCHAR2(20)
TIME_SHIFT NUMBER(4)
AUTO_ID NUMBER
STATE VARCHAR2(3)

SQL> select * from t;

NAME TIME_SHIFT AUTO_ID STATE
-------------------- ---------- ---------- -----
a 12 1 A
a 13 2 B
a 13.4 3 C
a 18 4 A
a 19 5 B
a 22 6 A
a 22 7 B
b 8 8 C
b 9 9 D
c 1 10 C
c 2 11 C
c 3 12 C
c 4 13 C

13 rows selected.


I'd like to generate a report based on the following conditions:
- same name
- different state
- time shift range +/- 1 (time_shift-1 <= value <= time_shift+1) among rows

If the result matches these conditions, I called it a "GROUP". The desired output
will look like this:

auto_id_group avg_time_shift name
------------- -------------- ----
1,2,3 12.8 a
4,5 18.5 a
6,7 22 a
8,9 8.5 b

where,
auto_id_group - comma seperated sorted auto_id within "GROUP"
avg_time_shift - average of time_shift within "GROUP"

if the data were:

NAME TIME_SHIFT AUTO_ID STATE
-------------------- ---------- ---------- -----
a 12 1 A
a 13 2 B
a 13.4 3 C
a 14.4 4 A
...

14.4 row WILL NOT be in the 1,2,3 group since state in the "GROUP" has
to be different.

if the data were:

NAME TIME_SHIFT AUTO_ID STATE
-------------------- ---------- ---------- -----
a 12 1 A
a 13 2 B
a 13.4 3 C
a 14.4 4 D
...

would that 14.4 row be in the 1,2,3 group -- even though the spread from the first is 2.4?

>>>
Sorry about confusion, Tom.

There will be no 1,2,3 group but 1,2,3,4 group since they have:
- same name
- different state
- time shift range +/- 1 (PREVIOUS_ROW_time_shift-1 <= CURRENT_ROW_SHIFT <= PREVIOUS_ROW_time_shift+1) among rows. For example, 13-12=1, 13.4-13 <1, 14.4-13.4=1

Thanks,

and Tom said...

Cannot think of a "pure sql" approach to this particular one. I can think of a procedural one that works good in 9i using pipelined functions (had to change your number(4) to number(4,2) to do the math)

ops$tkyte@ORA9I.WORLD> create or replace type myScalarType as object
2 ( auto_id_group varchar2(40),
3 avg_time_shift number,
4 name varchar2(5)
5 )
6 /

Type created.

ops$tkyte@ORA9I.WORLD>
ops$tkyte@ORA9I.WORLD> create or replace type myTableType as table of myScalarType
2 /

Type created.

ops$tkyte@ORA9I.WORLD>
ops$tkyte@ORA9I.WORLD> create or replace function func( p_cursor in SYS_REFCURSOR )
2 return myTableType
3 PIPELINED
4 as
5 l_rec t%rowtype;
6
7 l_last_name varchar2(5);
8 l_last_time_shift number;
9 l_last_states varchar2(4000);
10 l_auto_group_id varchar2(4000);
11 l_sum_time_shift number;
12 l_cnt_time_shift number;
13 begin
14 loop
15 fetch p_cursor into l_rec;
16 exit when p_cursor%notfound;
17
18 if ( l_rec.name <> l_last_name OR
19 instr( '/' || l_rec.state || '/', l_last_states ) > 0 OR
20 abs( l_rec.time_shift-l_last_time_shift ) > 1 OR
21 l_auto_group_id is null )
22 then
23 if ( l_auto_group_id is not null )
24 then
25 pipe row (myScalarType( l_auto_group_id,
26 l_sum_time_shift/l_cnt_time_shift, l_last_name));
27 end if;
28
29 l_auto_group_id := l_rec.auto_id;
30 l_sum_time_shift := l_rec.time_shift;
31 l_cnt_time_shift := 1;
32 l_last_name := l_rec.name;
33 l_last_states := '/' || l_rec.state || '/';
34 l_last_time_shift:= l_rec.time_shift;
35 else
36 l_auto_group_id := l_auto_group_id || ',' || l_rec.auto_id;
37 l_sum_time_shift := l_sum_time_shift + l_rec.time_shift;
38 l_cnt_time_shift := l_cnt_time_shift + 1;
39 l_last_states := l_last_states || l_rec.state || '/';
40 l_last_time_shift:= l_rec.time_shift;
41 end if;
42 end loop;
43 if ( l_auto_group_id is not null )
44 then
45 pipe row (myScalarType( l_auto_group_id,
46 l_sum_time_shift/l_cnt_time_shift, l_last_name));
47 end if;
48
49 return;
50 end;
51 /

Function created.

ops$tkyte@ORA9I.WORLD>
ops$tkyte@ORA9I.WORLD> select *
2 from TABLE( func( CURSOR (select * from t order by name, time_shift) ) )
3 /

AUTO_ID_GROUP AVG_TIME_SHIFT NAME
-------------------- -------------- ------------------------------
1,2,3 12.8 a
4,5 18.5 a
6,7 22 a
8,9 8.5 b
10 1 c
11 2 c
12 3 c
13 4 c

8 rows selected.

In 8i, this could be:

ops$tkyte@ORA817DEV.US.ORACLE.COM> create or replace function func( p_query in varchar2 )
2 return myTableType
3 as
4 type rc is ref cursor;
5 l_cursor rc;
6 l_rec t%rowtype;
7
8 l_last_name varchar2(5);
9 l_last_time_shift number;
10 l_last_states varchar2(4000);
11 l_auto_group_id varchar2(4000);
12 l_sum_time_shift number;
13 l_cnt_time_shift number;
14
15 l_data myTableType := myTableType();
16 begin
17 open l_cursor for p_query;
18 loop
19 fetch l_cursor into l_rec;
20 exit when l_cursor%notfound;
21
22 if ( l_rec.name <> l_last_name OR
23 instr( '/' || l_rec.state || '/', l_last_states ) > 0 OR
24 abs( l_rec.time_shift-l_last_time_shift ) > 1 OR
25 l_auto_group_id is null )
26 then
27 if ( l_auto_group_id is not null )
28 then
29 l_data.extend;
30 l_data(l_data.count) :=
31 myScalarType( l_auto_group_id,
32 l_sum_time_shift/l_cnt_time_shift, l_last_name);
33 end if;
34
35 l_auto_group_id := l_rec.auto_id;
36 l_sum_time_shift := l_rec.time_shift;
37 l_cnt_time_shift := 1;
38 l_last_name := l_rec.name;
39 l_last_states := '/' || l_rec.state || '/';
40 l_last_time_shift:= l_rec.time_shift;
41 else
42 l_auto_group_id := l_auto_group_id || ',' || l_rec.auto_id;
43 l_sum_time_shift := l_sum_time_shift + l_rec.time_shift;
44 l_cnt_time_shift := l_cnt_time_shift + 1;
45 l_last_states := l_last_states || l_rec.state || '/';
46 l_last_time_shift:= l_rec.time_shift;
47 end if;
48 end loop;
49 close l_cursor;
50 if ( l_auto_group_id is not null )
51 then
52 l_data.extend;
53 l_data(l_data.count) :=
54 myScalarType( l_auto_group_id,
55 l_sum_time_shift/l_cnt_time_shift, l_last_name);
56 end if;
57
58 return l_data;
59 end;
60 /

Function created.

ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> select *
2 from TABLE( cast( func('select * from t order by name, time_shift')
3 as myTabletype ) )
4 /

AUTO_ID_GROUP AVG_TIME_SHIFT NAME
---------------------------------------- -------------- ------------------------------
1,2,3 12.8 a
4,5 18.5 a
6,7 22 a
8,9 8.5 b
10 1 c
11 2 c
12 3 c
13 4 c

8 rows selected.


sorry -- no really "cool" answer for this one.

Rating

  (27 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

followup

Charlie, June 18, 2002 - 2:11 pm UTC

Thank you very much for your quick response!  I tried your approach for 8i but it finds auto_id 3, which seems not right.

SQL> select * from t;

NAME                 TIME_SHIFT    AUTO_ID STATE
-------------------- ---------- ---------- -----
a                            12          1 A
a                            13          2 B
a                            14          3 A  <- same state as auto_id 1, which should be excluded
a                            18          4 A
a                            19          5 B
b                             8          6 C
b                             9          7 C
aa                            1          8 C
aa                            2          9 C
aa                            3         10 C
aa                            4         11 C

11 rows selected.

Elapsed: 00:00:00.10
SQL> select * from TABLE(cast( func('select * from t order by name, time_shift') as myTabletype ) );

AUTO_ID_GROUP                            AVG_TIME_SHIFT NAME
---------------------------------------- -------------- -----
1,2,3                                                13 a
4,5                                                18.5 a
8                                                     1 aa
9                                                     2 aa
10                                                    3 aa
11                                                    4 aa
6                                                     8 b
7                                                     9 b

8 rows selected.


I'm thinking is there any way we can use powerful analytic function(lag, window range/rows) and combine
with max(decode(...)) technique to flatten auto_ids as 1,2,3,4... or somthing like that.

Thanks, 

Tom Kyte
June 18, 2002 - 2:44 pm UTC

sorry -- had the instr backwards:

instr( l_last_states, '/' || l_rec.state || '/' ) > 0 OR


is what it should be. That'll put 3 into a group of its own when 3 has a state of 'A'. Now, if you just want to IGNORE it, you'll need to supply appropriate logic on your own to get that result (the concept is here, you'll just need to add the right code to make it perfect for your requirements)



almost similar problem

dxl, June 15, 2004 - 4:54 am UTC

Tom

I'm sure this is a simple problem for you. If I have a table:

drop table t1;

create table t1 (c1 varchar2(100), c2 varchar2(100));


insert into t1 values ('aa','abcd');
insert into t1 values ('aa','efgh');
insert into t1 values ('aa','ijkl');
insert into t1 values ('aa','mnop');
insert into t1 values ('aa','qrst');
insert into t1 values ('aa','uvwx');
insert into t1 values ('aa','yz');


insert into t1 values ('bb','1234');
insert into t1 values ('bb','5678');
insert into t1 values ('bb','910');
insert into t1 values ('bb','1112');
insert into t1 values ('bb','1314');

column c1 format a6
column c2 format a6

select c1,c2 from t1;

C1 C2
------ ------
aa abcd
aa efgh
aa ijkl
aa mnop
aa qrst
aa uvwx
aa yz
bb 1234
bb 5678
bb 910
bb 1112
bb 1314

i would like to concatenate the c2 column using a delimiter and grouped by c1.
However if the concatenated column gets too long,ie gets to a specified length (which can be specified at runtime),
then a new row is started. So for my table t1 if i choose a max length of 20 for the new c2 column then i would see


C1 new c2
------ ------
aa abcd~efgh~ijkl~mnop
aa qrst~uvwx~yz
bb 1234~5678~910~1112
bb 1314


note that the first string abcd~efgh~ijkl~mnop is only 19 long but the next element could not be added since it would take it over
the max length of 20, so the next element begins the next line.

I can probably work out how to do this procedurally but is there a more "cool" way to do it in sql??
If not and procedurally is the way to go, what do you think would be a good way, using collections, ie build an object and
a table of objects??

Thanks for your help

Tom Kyte
June 15, 2004 - 3:36 pm UTC

this is procedural.

pipelined function taking a cursor as input comes to mind.

thanks

dxl, June 16, 2004 - 4:25 am UTC

Please can you expand on that. I am not very familiar with pipelined functions. We are on 8.1.7.4, would i still need to use collections with the pipelined functions??

Tom Kyte
June 16, 2004 - 12:39 pm UTC

not an option in 817.

search this site for pipelined, lots of examples out there.

A SQL Model Clause solution for the tricky original question !

Frank Zhou, November 15, 2006 - 5:46 pm UTC

Tom,
    Here is a 10G SQL Model Clause solution for the tricky original question.

SQL Model clause   rock...
Analytics          roll....

Thanks,

Frank

SQL> SELECT Auto_id_Group,
 sum_time/(LENGTH(Auto_id_Group)- LENGTH(REPLACE(Auto_id_Group, ',', '')) +1) AVG_TIME_SHIFT, name
  3  FROM
  4  (SELECT name, Auto_id_Group, sum_time
  5  FROM
  6   (SELECT name, Auto_id_Group, sum_time
  7      FROM   t
  8      MODEL
  9   DIMENSION BY
 10      (name,
 11       row_number() over (partition by name ORDER BY  time_shift) id
 12      )
 13   MEASURES(time_shift,
 14         0 flag,
 15      CAST(NULL AS NUMBER) sum_time,
 16      state state,
 17      CAST(auto_id AS VARCHAR2(3)) auto_id,
 18      CAST(NULL AS VARCHAR2(3255)) name_group,
 19      CAST(NULL AS VARCHAR2(3255)) IdGroup_temp,
 20      CAST(NULL AS VARCHAR2(3255)) Auto_id_Group
 21            )
 22       RULES  ITERATE(2)
 23    (
 24      ------------------------ Auto_id_Group
 25      Auto_id_Group[ANY, ANY] ORDER BY  name, id =
 26      CASE WHEN flag[CV(), CV()]  = 1
 27      THEN  CASE WHEN  IdGroup_temp[CV(), CV()+1 ] IS NOT NULL
 28               THEN CASE
 29               WHEN instr( IdGroup_temp[CV(),CV()+1],
                       IdGroup_temp[CV(),CV()])<1
 30                     THEN IdGroup_temp[CV(), CV()]
 31                     END
 32               ELSE  IdGroup_temp[CV(),CV()]
 33           END
 34     END  ,
 35   ------------------------name_group
 36     name_group[ANY, ANY] ORDER BY  name, id =
 37     CASE WHEN flag[CV(), CV()]  = 0
 38          THEN  CASE WHEN abs(time_shift[CV(),CV()]- 
                             nvl(time_shift[CV(),CV()-1],
                             time_shift[CV(),CV()]))<= 1
 39                          AND instr(name_group[CV(), CV()-1] , 
                             STATE[CV() ,CV()])<1
 40                     THEN name_group[CV(),CV()-1]|| ',' ||STATE[CV(),CV()]
 41                     ELSE STATE[CV(),CV()]
 42                 END
 43      END,
 44      ------------------------  IdGroup_temp
 45     IdGroup_temp[ANY, ANY] ORDER BY  name, id =
 46     CASE WHEN flag[CV(), CV()] = 0
 47          THEN CASE WHEN abs(time_shift[CV(),CV()]- 
                  nvl(time_shift[CV(),CV()-1], time_shift[CV(),CV()]))<= 1
 48                    AND  instr (name_group[CV(),CV()-1], STATE[CV(),CV()])< 1
 49                    THEN IdGroup_temp[CV() , CV()-1]|| 
                        ',' || auto_id[CV(), CV()]
 50                    ELSE auto_id[CV(),CV()]
 51               END
 52      END ,
 53     ------------------------  Sum_time
 54     sum_time[ANY, ANY] ORDER BY name, id =
 55     CASE WHEN flag[CV(), CV() ]  = 0
 56           THEN CASE WHEN abs(time_shift[CV(),CV()]- 
          nvl(time_shift[CV(),CV()-1],time_shift[CV(),CV()]))<= 1
 57                 AND instr(name_group[ CV(),CV()-1], STATE[CV(),CV()] )  < 1
 58                    THEN  sum_time[ CV() , CV()-1]  + time_shift[CV(),CV()]
 59                    ELSE  time_shift[CV(),CV()]
 60                END
 61          ELSE sum_time[CV(),CV()]
 62     END ,
 63    -------------------flag 
 64    flag[ANY, ANY] = 1
 65    )
 66   )
 67 )
 68  WHERE  Auto_id_Group IS NOT NULL
 69  ORDER BY  name ;

AUTO_ID_GROUP  AVG_TIME_SHIFT  NAME                                                             
-------------- --------------- -----                                             
1,2,3                    12.8  a                                                                
4,5                      18.5  a                                                                                                              
6,7                        22  a                                                                                                           
8,9                       8.5  b                                                                                                              
10                          1  c                                                                                                              
11                          2  c                                                                              
12                          3  c                                                                                                              
13                          4  c                                                                
                                                                                
8 rows selected.

SQL> spool off
 

Backwards Search?

V, February 21, 2007 - 10:46 am UTC

Is there anyway to find the unique datafile locations using instr?
i.e. /mt1/oradata/TEST/system.dbf
/mt1/oraind/TEST/index.dbf

and only return
/mt1/oradata
/mt1/oraind
Tom Kyte
February 21, 2007 - 11:10 am UTC

why isn't TEST part of this unique location??

TEST

V, February 21, 2007 - 11:23 am UTC

It can be part of the Unique set
Tom Kyte
February 21, 2007 - 12:25 pm UTC

that makes no sense - please explain a tad more verbosely

I see no logic behind removing 'TEST' from the path.

TEST

V, February 21, 2007 - 1:24 pm UTC

Basically I am trying to run a query that would find all the unique mount points from v$datafile, v$logfile, v$controlfile so I can have a shell script that can copy the mount points.

I think this works:
select unique substr(name,1,instr(name,'/',-10,1)-1) "Unique Mount Pts" from v$datafile
Tom Kyte
February 21, 2007 - 3:09 pm UTC

define mount point

/x/y/TEST could be a mount point

/x could be the mount point

/x/y could be the mount point


You need to sort of define what you want - mount point doesn't get us there yet.

To: V

Michel Cadot, February 22, 2007 - 7:30 am UTC


It seems you want the greatest common path of all your files.
So, first search all files, remove file name to keep only distinct paths.
Then generates all subpaths, keep only those that are common and then select the longest one.

In SQL:
SQL> with
  2    path as (
  3      select substr(name,1,instr(name,'\',-1)) path from v$datafile
  4      union
  5      select substr(name,1,instr(name,'\',-1)) path from v$tempfile
  6      union
  7      select substr(member,1,instr(member,'\',-1)) path from v$logfile
  8      union
  9      select substr(name,1,instr(name,'\',-1)) path from v$controlfile
 10    ),
 11    allsub as (
 12      select path, 
 13             substr(path, 1, instr(path,'\', 1, level)-1) subpath,
 14             count(*) over 
 15               (partition by substr(path, 1, instr(path,'\', 1, level))) subcnt,
 16             count(distinct path) over () nbpath
 17      from path
 18      connect by 
 19            prior path = path
 20        and prior dbms_random.value is not null
 21        and instr(path,'\', 1, level) > 0
 22    ),
 23    eligible as (
 24      select subpath, length(subpath) lg,
 25             max(length(subpath)) over () maxlg
 26      from allsub
 27      where subcnt = nbpath
 28    )
 29  select subpath "Common subpath"
 30  from eligible
 31  where lg = maxlg and rownum = 1
 32  /
Common subpath
------------------------------
C:\ORACLE

1 row selected.

For Unix, change '\' to '/'.

Regards
Michel

To: V

Michel Cadot, March 01, 2007 - 2:00 am UTC


Rereading your question I think I misunderstood it, what you want is all the paths when you remove the largest common end.
Just a slight modification in the previous query:
SQL> select substr(name,1,instr(name,'\',-1)) path from v$datafile
  2  union
  3  select substr(name,1,instr(name,'\',-1)) path from v$tempfile
  4  union
  5  select substr(member,1,instr(member,'\',-1)) path from v$logfile
  6  union
  7  select substr(name,1,instr(name,'\',-1)) path from v$controlfile
  8  /
PATH
------------------------------
C:\ORACLE\ARCHIVES\MIKA\
C:\ORACLE\BASES\MIKA\
C:\ORACLE\FLASH\MIKA\

3 rows selected.

SQL> with
  2    path as (
  3      select substr(name,1,instr(name,'\',-1)) path from v$datafile
  4      union
  5      select substr(name,1,instr(name,'\',-1)) path from v$tempfile
  6      union
  7      select substr(member,1,instr(member,'\',-1)) path from v$logfile
  8      union
  9      select substr(name,1,instr(name,'\',-1)) path from v$controlfile
 10    ),
 11    allsub as (
 12      select path, 
 13             substr(path, instr(path,'\', -1, level)) subpath,
 14             count(*) over 
 15               (partition by substr(path, instr(path,'\', -1, level))) subcnt,
 16             count(distinct path) over () nbpath
 17      from path
 18      connect by     prior path = path
 19                 and prior dbms_random.value is not null
 20                 and instr(path,'\', -1, level) > 0
 21    ),
 22    eligible as (
 23      select path, subpath, length(subpath) lg,
 24             max(length(subpath)) over () maxlg
 25      from allsub
 26      where subcnt = nbpath
 27     )
 28  select distinct substr(path,1,instr(path,subpath,-1)) "Uncommon subpath"
 29  from eligible
 30  where lg = maxlg
 31  order by 1
 32  /
Uncommon subpath
------------------------------------------------------------------------------------
C:\ORACLE\ARCHIVES\
C:\ORACLE\BASES\
C:\ORACLE\FLASH\

3 rows selected.

Regards
Michel

shift query

vijay, August 31, 2011 - 1:47 pm UTC

nice post

I have a query. I have shift tables.
it has different shift with start and end time in hh24 format.

IDX_SM_ID_PK NUMBER(4,0)
SM_NAME VARCHAR2(50)
SM_START_TIME VARCHAR2(8)
SM_END_TIME VARCHAR2(8)




- I need to fetch current shift, previous shift and next shift.





Tom Kyte
August 31, 2011 - 2:18 pm UTC

no create
no inserts
no look

no promises either - You'll need to explain yourself A LOT better. I have no clue, none, what

"- I need to fetch current shift, previous shift and next shift."

means.

Regarding dxl's question

Jichao Li, August 31, 2011 - 11:33 pm UTC

Tom, dxl,

Regarding dxl's question at bookmark http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:4273679444401#19727549535105
The query below almost does the work but with a little sacrifice that some output rows have a few more characters than requested.

Insert into T1 (C1,C2) values ('aa','abcd');
Insert into T1 (C1,C2) values ('aa','efgh');
Insert into T1 (C1,C2) values ('aa','ijkl');
Insert into T1 (C1,C2) values ('aa','mnop');
Insert into T1 (C1,C2) values ('aa','qrst');
Insert into T1 (C1,C2) values ('aa','uvwx');
Insert into T1 (C1,C2) values ('aa','yz');
Insert into T1 (C1,C2) values ('bb','1234');
Insert into T1 (C1,C2) values ('bb','5678');
Insert into T1 (C1,C2) values ('bb','910');
Insert into T1 (C1,C2) values ('bb','1112');
Insert into T1 (C1,C2) values ('bb','1314');
Insert into T1 (C1,C2) values ('aa','abcd');
Insert into T1 (C1,C2) values ('aa','efgh');
Insert into T1 (C1,C2) values ('aa','ijkl');
Insert into T1 (C1,C2) values ('aa','mnop');
Insert into T1 (C1,C2) values ('aa','qrst');
Insert into T1 (C1,C2) values ('aa','uvwx');
Insert into T1 (C1,C2) values ('aa','yz');
Insert into T1 (C1,C2) values ('bb','1234');
Insert into T1 (C1,C2) values ('bb','5678');
Insert into T1 (C1,C2) values ('bb','910');
Insert into T1 (C1,C2) values ('bb','1112');
Insert into T1 (C1,C2) values ('bb','1314');
Insert into T1 (C1,C2) values ('aa','abcd');
Insert into T1 (C1,C2) values ('aa','efgh');
Insert into T1 (C1,C2) values ('aa','ijkl');
Insert into T1 (C1,C2) values ('aa','mnop');
Insert into T1 (C1,C2) values ('aa','qrst');
Insert into T1 (C1,C2) values ('aa','uvwx');
Insert into T1 (C1,C2) values ('aa','yz');
Insert into T1 (C1,C2) values ('bb','1234');
Insert into T1 (C1,C2) values ('bb','5678');
Insert into T1 (C1,C2) values ('bb','910');
Insert into T1 (C1,C2) values ('bb','1112');
Insert into T1 (C1,C2) values ('bb','1314');
Insert into T1 (C1,C2) values ('aa','abcd');
Insert into T1 (C1,C2) values ('aa','efgh');
Insert into T1 (C1,C2) values ('aa','ijkl');
Insert into T1 (C1,C2) values ('aa','mnop');
Insert into T1 (C1,C2) values ('aa','qrst');
Insert into T1 (C1,C2) values ('aa','uvwx');
Insert into T1 (C1,C2) values ('aa','yz');
Insert into T1 (C1,C2) values ('bb','1234');
Insert into T1 (C1,C2) values ('bb','5678');
Insert into T1 (C1,C2) values ('bb','910');
Insert into T1 (C1,C2) values ('bb','1112');
Insert into T1 (C1,C2) values ('bb','1314');
Insert into T1 (C1,C2) values ('aa','abcd');
Insert into T1 (C1,C2) values ('aa','efgh');
Insert into T1 (C1,C2) values ('aa','ijkl');
Insert into T1 (C1,C2) values ('aa','mnop');
Insert into T1 (C1,C2) values ('aa','qrst');
Insert into T1 (C1,C2) values ('aa','uvwx');
Insert into T1 (C1,C2) values ('aa','yz');
Insert into T1 (C1,C2) values ('bb','1234');
Insert into T1 (C1,C2) values ('bb','5678');
Insert into T1 (C1,C2) values ('bb','910');
Insert into T1 (C1,C2) values ('bb','1112');
Insert into T1 (C1,C2) values ('bb','1314');
Insert into T1 (C1,C2) values ('aa','abcd');
Insert into T1 (C1,C2) values ('aa','efgh');
Insert into T1 (C1,C2) values ('aa','ijkl');
Insert into T1 (C1,C2) values ('aa','mnop');
Insert into T1 (C1,C2) values ('aa','qrst');
Insert into T1 (C1,C2) values ('aa','uvwx');
Insert into T1 (C1,C2) values ('aa','yz');
Insert into T1 (C1,C2) values ('bb','1234');
Insert into T1 (C1,C2) values ('bb','5678');
Insert into T1 (C1,C2) values ('bb','910');
Insert into T1 (C1,C2) values ('bb','1112');
Insert into T1 (C1,C2) values ('bb','1314');

SELECT c1,
       ltrim(MAX(catstr), '~') AS output,
       length(ltrim(MAX(catstr), '~')) AS output_len
  FROM (SELECT c1,
               group_id,
               sys_connect_by_path(c2, '~') AS catstr
          FROM (SELECT c1,
                       c2,
                       row_id,
                       lag(row_id) over(PARTITION BY c1, group_id ORDER BY row_id) AS prev_rid,
                       group_id
                  FROM (SELECT c1,
                               c2,
                               floor((cum_len - 1) / 20) AS group_id,
                               row_id
                          FROM (SELECT c1,
                                       c2,
                                       ROWID AS row_id,
                                       SUM(length(c2) + 1) over(PARTITION BY c1 ORDER BY ROWID rows BETWEEN unbounded preceding AND CURRENT ROW) AS cum_len
                                  FROM t1)))
         START WITH prev_rid IS NULL
        CONNECT BY PRIOR row_id = prev_rid)
 GROUP BY c1,
          group_id
 ORDER BY c1,
          group_id;

C1    OUTPUT                    OUTPUT_LEN
----- ------------------------- ----------
aa    abcd~efgh~ijkl~mnop               19
aa    qrst~uvwx~yz~abcd                 17
aa    efgh~ijkl~mnop~qrst               19
aa    uvwx~yz~abcd~efgh                 17
aa    ijkl~mnop~qrst~uvwx~yz            22
aa    abcd~efgh~ijkl~mnop               19
aa    qrst~uvwx~yz~abcd                 17
aa    efgh~ijkl~mnop~qrst               19
aa    uvwx~yz~abcd~efgh~ijkl            22
aa    mnop~qrst~uvwx~yz                 17
bb    1234~5678~910~1112                18
bb    1314~1234~5678~910                18
bb    1112~1314~1234~5678               19
bb    910~1112~1314~1234                18
bb    5678~910~1112~1314                18
bb    1234~5678~910~1112~1314           23
bb    1234~5678~910~1112                18
bb    1314                               4

18 rows selected.

Tom Kyte
September 01, 2011 - 8:15 am UTC

Nice try! Excellent

if you change your sum to:

SUM(length(c2)+1) over(PARTITION BY c1 ORDER BY ROWID rows BETWEEN unbounded preceding AND 1 following)


it'll build strings that might not be as perfectly long as they *could* be - but all will less then or equal to the maximum length.

An exact solution

Michel Cadot, September 01, 2011 - 7:58 am UTC


SQL> def lg=20
SQL> col val format a&lg
SQL> col c1 format a2
SQL> with
  2    data as (
  3      select c1, c2,
  4             lead (c2) over (partition by c1 order by c2, rowid) next_c2,
  5             row_number() over (partition by c1 order by c2, rowid) rn
  6      from t1
  7    ),
  8    compute (c1, val, rn, isleaf) as (
  9      select c1, c2, rn,
 10             case
 11               when next_c2 is null then 1
 12               when length(c2)+length(next_c2)+1 > &lg then 1
 13               else 0
 14             end
 15      from data
 16      where rn = 1
 17      union all
 18      select d.c1, decode(c.isleaf, 0, c.val||'-') || d.c2, d.rn,
 19             case
 20               when d.next_c2 is null then 1
 21               when c.isleaf = 1 and length(d.c2)+length(d.next_c2)+1 > &lg then 1
 22               when c.isleaf = 0 and length(c.val)+length(d.c2)+length(d.next_c2)+2 > &lg
 23                 then 1
 24               else 0
 25             end
 26      from compute c, data d
 27      where d.c1 = c.c1 and d.rn = c.rn + 1
 28    )
 29  select c1, val
 30  from compute
 31  where isleaf = 1
 32  order by c1, rn
 33  /
C1 VAL
-- --------------------
aa abcd-abcd-abcd-abcd
aa abcd-abcd-efgh-efgh
aa efgh-efgh-efgh-efgh
aa ijkl-ijkl-ijkl-ijkl
aa ijkl-ijkl-mnop-mnop
aa mnop-mnop-mnop-mnop
aa qrst-qrst-qrst-qrst
aa qrst-qrst-uvwx-uvwx
aa uvwx-uvwx-uvwx-uvwx
aa yz-yz-yz-yz-yz-yz
bb 1112-1112-1112-1112
bb 1112-1112-1234-1234
bb 1234-1234-1234-1234
bb 1314-1314-1314-1314
bb 1314-1314-5678-5678
bb 5678-5678-5678-5678
bb 910-910-910-910-910
bb 910


Regards
Michel

Tom Kyte
September 01, 2011 - 8:23 am UTC

nicer ;)

An exact solution

Michel Cadot, September 01, 2011 - 8:00 am UTC


SQL> def lg=20
SQL> col val format a&lg
SQL> col c1 format a2
SQL> with
  2    data as (
  3      select c1, c2,
  4             lead (c2) over (partition by c1 order by c2, rowid) next_c2,
  5             row_number() over (partition by c1 order by c2, rowid) rn
  6      from t1
  7    ),
  8    compute (c1, val, rn, isleaf) as (
  9      select c1, c2, rn,
 10             case
 11               when next_c2 is null then 1
 12               when length(c2)+length(next_c2)+1 > &lg then 1
 13               else 0
 14             end
 15      from data
 16      where rn = 1
 17      union all
 18      select d.c1, decode(c.isleaf, 0, c.val||'-') || d.c2, d.rn,
 19             case
 20               when d.next_c2 is null then 1
 21               when c.isleaf = 1 and length(d.c2)+length(d.next_c2)+1 > &lg then 1
 22               when c.isleaf = 0 and length(c.val)+length(d.c2)+length(d.next_c2)+2 > &lg
 23                 then 1
 24               else 0
 25             end
 26      from compute c, data d
 27      where d.c1 = c.c1 and d.rn = c.rn + 1
 28    )
 29  select c1, val
 30  from compute
 31  where isleaf = 1
 32  order by c1, rn
 33  /
C1 VAL
-- --------------------
aa abcd-abcd-abcd-abcd
aa abcd-abcd-efgh-efgh
aa efgh-efgh-efgh-efgh
aa ijkl-ijkl-ijkl-ijkl
aa ijkl-ijkl-mnop-mnop
aa mnop-mnop-mnop-mnop
aa qrst-qrst-qrst-qrst
aa qrst-qrst-uvwx-uvwx
aa uvwx-uvwx-uvwx-uvwx
aa yz-yz-yz-yz-yz-yz
bb 1112-1112-1112-1112
bb 1112-1112-1234-1234
bb 1234-1234-1234-1234
bb 1314-1314-1314-1314
bb 1314-1314-5678-5678
bb 5678-5678-5678-5678
bb 910-910-910-910-910
bb 910

Regards
Michel

Modelification

Brendan, September 01, 2011 - 3:19 pm UTC

Yes, neat.

I took elements of Michel's solution and modelified thus (adding 3 extra test records):
SELECT c1, wrap
  FROM (
SELECT c1, c2, wrap, isleaf, rn
  FROM t1
 MODEL
    PARTITION BY (c1)
    DIMENSION BY (Row_Number() OVER (PARTITION BY c1 ORDER BY c2, ROWID) rn)
    MEASURES (c2, c2 wrap, 0 isleaf)
    RULES (
       wrap[rn>1]   = CASE WHEN Length(c2[CV()]) + 1 + Length(wrap[CV()-1]) > &lg THEN c2[CV()] 
                           ELSE wrap[CV()-1] || '~' || c2[CV()] END,
       isleaf[rn=1] = PresentV (c2[CV()+1], 
                           CASE WHEN Length(wrap[CV()]) + 1 + Length(c2[CV()+1]) > &lg THEN 1 END, 1),
       isleaf[rn>1] = PresentV (c2[CV()+1], 
                           CASE WHEN Length(wrap[CV()]) + 1 + Length(c2[CV()+1]) > &lg THEN 1 END, 1)
    )
)
 WHERE isleaf = 1
 ORDER BY c1, rn
 
C1    WRAP
----- --------------------
aa    abcd~abcd~abcd~abcd
aa    abcd~abcd~efgh~efgh
aa    efgh~efgh~efgh~efgh
aa    ijkl~ijkl~ijkl~ijkl
aa    ijkl~ijkl~mnop~mnop
aa    mnop~mnop~mnop~mnop
aa    qrst~qrst~qrst~qrst
aa    qrst~qrst~uvwx~uvwx
aa    uvwx~uvwx~uvwx~uvwx
aa    yz~yz~yz~yz~yz~yz
bb    1112~1112~1112~1112
bb    1112~1112~1234~1234
bb    1234~1234~1234~1234
bb    1314~1314~1314~1314
bb    1314~1314~5678~5678
bb    5678~5678~5678~5678
bb    910~910~910~910~910
bb    910
cc    1314
dd    1234567890
dd    ABCDEFGHIJ

21 rows selected.

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |       |       |     5 (100)|          |
|   1 |  SORT ORDER BY        |      |    75 |  9750 |     5  (40)| 00:00:01 |
|*  2 |   VIEW                |      |    75 |  9750 |     4  (25)| 00:00:01 |
|   3 |    SQL MODEL ORDERED  |      |    75 |  8700 |     4  (25)| 00:00:01 |
|   4 |     WINDOW SORT       |      |    75 |  8700 |     4  (25)| 00:00:01 |
|   5 |      TABLE ACCESS FULL| T1   |    75 |  8700 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("ISLEAF"=1)

Note
-----
   - dynamic sampling used for this statement

Ouch, a line too many...

Brendan, September 01, 2011 - 3:30 pm UTC

SELECT c1, wrap
FROM (
SELECT c1, c2, wrap, isleaf, rn
FROM t1
MODEL
PARTITION BY (c1)
DIMENSION BY (Row_Number() OVER (PARTITION BY c1 ORDER BY c2, ROWID) rn)
MEASURES (c2, c2 wrap, 0 isleaf)
RULES (
wrap[rn>1] = CASE WHEN Length(c2[CV()]) + 1 + Length(wrap[CV()-1]) > &lg THEN c2[CV()]
ELSE wrap[CV()-1] || '~' || c2[CV()] END,
isleaf[ANY] = PresentV (c2[CV()+1],
CASE WHEN Length(wrap[CV()]) + 1 + Length(c2[CV()+1]) > &lg THEN 1 END, 1)
)
)
WHERE isleaf = 1
ORDER BY c1, rn

Comparing exact solutions

Stew Ashton, September 01, 2011 - 3:32 pm UTC

>with
data as (
select c1, c2,
lead (c2) over (partition by c1 order by c2, rowid) next_c2,
row_number() over (partition by c1 order by c2, rowid) rn
from t1
),
compute (c1, val, rn, isleaf) as (
select c1, c2, rn,
case
when next_c2 is null then 1
when length(c2)+length(next_c2)+1 > 20 then 1
else 0
end
from data
where rn = 1
union all
select d.c1, decode(c.isleaf, 0, c.val||'-') || d.c2, d.rn,
case
when d.next_c2 is null then 1
when c.isleaf = 1 and length(d.c2)+length(d.next_c2)+1 > 20 then 1
when c.isleaf = 0 and length(c.val)+length(d.c2)+length(d.next_c2)+2 > 20
then 1
else 0
end
from compute c, data d
where d.c1 = c.c1 and d.rn = c.rn + 1
)
select c1, val
from compute
where ISLEAF = 1
order by C1, RN;

DBMS_XPLAN.DISPLAY_CURSOR output:
---------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | A-Rows | Buffers |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 18 | 301 |
| 1 | SORT ORDER BY | | 1 | 18 | 301 |
|* 2 | VIEW | | 1 | 18 | 301 |
| 3 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | 1 | 72 | 301 |
|* 4 | VIEW | | 1 | 2 | 7 |
| 5 | WINDOW SORT | | 1 | 72 | 7 |
| 6 | TABLE ACCESS FULL | T1 | 1 | 72 | 7 |
|* 7 | HASH JOIN | | 42 | 70 | 294 |
| 8 | RECURSIVE WITH PUMP | | 42 | 72 | 0 |
| 9 | VIEW | | 42 | 3024 | 294 |
| 10 | WINDOW SORT | | 42 | 3024 | 294 |
| 11 | TABLE ACCESS FULL | T1 | 42 | 3024 | 294 |
---------------------------------------------------------------------------------------

>select c1, val from (
select * from T1
model
partition by (C1)
dimension by (ROW_NUMBER() over(partition by C1 order by C2) RN)
measures (c2, cast(C2 as varchar2(4000)) val, 0 flag)
rules (
VAL[RN>1] order by RN = case when length(VAL[CV()-1]||'/'||C2[CV()]) > 20
then C2[CV()] else val[CV()-1]||'/'||C2[CV()] end,
FLAG[any] order by RN = case when length(VAL[CV()]||'/'||C2[CV()+1]) > 20
or C2[CV()+1] is null then 1 end
)
)
where FLAG = 1 order by C1, RN;

DBMS_XPLAN.DISPLAY_CURSOR output:
-------------------------------------------------------------------------------
| Id | Operation | Name | Starts | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 18 |00:00:00.01 | 7 |
| 1 | SORT ORDER BY | | 1 | 18 |00:00:00.01 | 7 |
|* 2 | VIEW | | 1 | 18 |00:00:00.01 | 7 |
| 3 | SQL MODEL ORDERED | | 1 | 72 |00:00:00.01 | 7 |
| 4 | WINDOW SORT | | 1 | 72 |00:00:00.01 | 7 |
| 5 | TABLE ACCESS FULL| T1 | 1 | 72 |00:00:00.01 | 7 |
-------------------------------------------------------------------------------

Unbelievable!

Stew Ashton, September 01, 2011 - 3:34 pm UTC


Brendan, you beat me to it! I should have known ;)

@Stew

Brendan, September 01, 2011 - 4:08 pm UTC

Ha ha, I was a bit worried you'd get in first :) - I waited to get home to post. How do you get the buffers to output from the XPLAN?

Getting buffers from XPLAN

Stew Ashton, September 01, 2011 - 5:34 pm UTC

alter session set STATISTICS_LEVEL='ALL';
or use the hint /*+ gather_plan_statistics */

Getting buffers from XPLAN: one more thing

Stew Ashton, September 02, 2011 - 2:56 am UTC


Oh yes, in the format options use 'IOSTATS LAST' or 'ALLSTATS LAST'.

Superb Michel and Brendan

Jichao Li, September 07, 2011 - 2:30 am UTC

Both Michel and Brendan have provided superb solutions! Michel introduced the new recursive query feature in 11g R2. And Brendan showed us the power of model clause, an area where many DBAs, at least me :), might have never entered.

Salute!
Jichao Li

Answer to dxl's question using 12c MATCH_RECOGNIZE

Stew Ashton, September 27, 2013 - 9:20 am UTC


When 12c starts being deployed, we should all be ready with the MATCH_RECOGNIZE clause. I am applying it to all sorts of problems that required either multiple inline analytic functions or the MODEL clause or recursive subqueries.

I am using the insert statements from http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:4273679444401#3873689100346951135
SELECT c1, listagg(c2, '/') within group(order by C2) val
FROM t1
MATCH_RECOGNIZE (
  PARTITION BY C1
  ORDER BY C2
  MEASURES match_number() grp
  ALL ROWS PER MATCH
  PATTERN (A b*)
  DEFINE b AS length(a.c2) + sum(length(b.c2)+1) <= 20
)
GROUP BY C1, grp
order by c1, grp;

C1 VAL
-- --------------------
aa abcd/abcd/abcd/abcd
aa abcd/abcd/efgh/efgh  
aa efgh/efgh/efgh/efgh  
aa ijkl/ijkl/ijkl/ijkl  
aa ijkl/ijkl/mnop/mnop  
aa mnop/mnop/mnop/mnop  
aa qrst/qrst/qrst/qrst  
aa qrst/qrst/uvwx/uvwx  
aa uvwx/uvwx/uvwx/uvwx  
aa yz/yz/yz/yz/yz/yz    
bb 1112/1112/1112/1112  
bb 1112/1112/1234/1234  
bb 1234/1234/1234/1234  
bb 1314/1314/1314/1314  
bb 1314/1314/5678/5678  
bb 5678/5678/5678/5678  
bb 910/910/910/910/910  
bb 910
This would be even cooler if LISTAGG() were supported in the MEASURES clause...

Answer to original question using 12c MATCH_RECOGNIZE

Stew Ashton, September 27, 2013 - 9:38 am UTC


Compare this to the MODEL clause solution: http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:4273679444401#76833955117389
SELECT listagg(auto_id, ',') WITHIN GROUP (ORDER BY auto_id) auto_id_group,
avg(time_shift) avg_time_shift,
name
FROM t
MATCH_RECOGNIZE (
  partition by name
  ORDER BY auto_id
  MEASURES match_number() grp
  ALL ROWS PER MATCH
  PATTERN (A b*)
  DEFINE b AS b.state != prev(b.state) AND abs(b.time_shift - prev(b.time_shift)) <= 1
)
GROUP BY name, grp
order by name, grp;

AUTO_ID_GROUP AVG_TIME_SHIFT NAME
------------- -------------- ----
1,2,3                   12.8 a    
4,5                     18.5 a    
6,7                       22 a    
8,9                      8.5 b    
10                         1 c    
11                         2 c    
12                         3 c    
13                         4 c

MODEL and MATCH_RECOGNIZE

Parthiban Nagarajan, October 04, 2013 - 6:19 am UTC

Hi Tom

I understand that PL/SQL is a procedural extension to SQL.
And I see MODEL and MATCH_RECOGNIZE clauses also to be very similar (extensions to SQL).
But why they are classified as SQL, I wonder.
May I know what is your opinion on this?

I have to admit that they look more complex than normal SQL and hence I couldn't learn them.
It seems that I cannot say "I know Oracle SQL".

Thanks and regards

Question to Answer to dxl's question using 12c MATCH_RECOGNIZE :)

Daniel Ramos, August 20, 2014 - 8:13 am UTC

Hi all.

In the Answer to dxl's question using 12c MATCH_RECOGNIZE you are using:

PATTERN (A b*)
DEFINE b AS length(a.c2) + sum(length(b.c2)+1) <= 20

Why do you need "A"?

I think they can be:

PATTERN (b*)
DEFINE b AS sum(length(b.c2)+1) <= 20
or
PATTERN (b+)
DEFINE b AS sum(length(b.c2)+1) <= 20

if you want get at least one...

Where are my mistake?

Thanks

Please respond

Asim, August 08, 2022 - 1:14 pm UTC

I agree to some extent with partibhan nagarajans comment on 4th october 2013, can any body please comment on this. Thanks
Chris Saxon
August 08, 2022 - 1:29 pm UTC

What exactly is it you want to know? I'm not sure what the confusion is.

MATCH_RECOGNIZE and MODEL are part of the SQL language - you can find their syntax in the SQL reference https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/index.html

More

A reader, August 08, 2022 - 1:56 pm UTC


Once I was reading somewhere that the SQL was made with the intention of it being for non technical business users, so it can be used by them to write queries simply telling what to do in almost human language (4GL) and not how to do (3GL). Initial SQL with few simple syntaxes (but yes limited functionality) was very simple and served the purpose it was made for. But with the passage of time, new extentions and so much syntaxes were added, RECURSIVE WITH, INLINE VIEWS, MODEL, MATCH_RECOGNIZE , Scalar queries, Lateral join, Analytic functions, GROUPING extentions, etc. (which are very good and add useful functionality), but they are so much and with so much complexity that it is no more possible for non technical users to use them. Can you imagine a non technical business user writing and understanding queries with MODEL, MATCH_RECOGNIZE RECURSIVE WITH etc.? Even a pure computer science technical user finds them difficult initially. So, I think SQL has now became at least 3.5GL, also because the syntax may still seem 4GL, but the mind of the user writing the query is thinking in procedural (3GL) way to solve a business problem/query.

This is my current opinion/feeling, which may be completely wrong, therefore I request your comments please.

Chris Saxon
August 08, 2022 - 4:33 pm UTC

It's true many of the extensions are complex. With most SQL you're still stating what, not how and I'd argue non-technical people still have a better chance of understanding SQL compared to equivalent Java/C/JS/... code.

There is no strict, agreed-on definition of what makes a language 3GL vs 4GL that I know of, so you could argue whichever way you want on this.

Does calling SQL 3.5GL vs 4GL make any practical difference? Not that I can think of.

More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library