Home>Question Details



Philip -- Thanks for the question regarding "STRAGG", version 9.0.2

Submitted on 14-Feb-2004 21:08 Central time zone
Last updated 29-Sep-2009 8:12

You Asked

Hi Tom,

Need your help here. 

Stage table:

Customer    Product    Order Date    Operator
--------        ------------------      ----------
C1        P1        D1        OP1
C1        P2        D2        OP2
C1        P3        D3        OP3
C1        P1        D4        OP3

C2        P2        D1        OP2
C2        P3        D2        OP4

C3        P1        D2        OP1
C3        P1        D3        OP1

I would like the results to look like the following (ordered by product name):

Output table:

Customer        Product_Str     Order_Date_Str  Operator_Str
--------        -----------     --------------  ------------
C1        P1,P1,P2,P3    D1,D4,D2,D3    OP1,OP3,OP2,OP4
C2        P2,P3        D1,D2        OP2,OP4
C3        P1,P1        D2,D3        OP1,OP1

There are about 14 millions customers and the number of orders per customer are infinite.

I was able to use STRAGG function to build the string fields fast (less than an hour).  
But the product name is not in order and these three fileds are out of sync. I know “Set” 
has no order concept.  

Is there any other way to construct this?   I have tried to use a function but it’s way 
too slow.  I have also tried to use analytic function row_number() over… but since the 
order number is not known…


Create table stage 
(customer varchar(2),
 product    varchar(2),
 order_date varhcar2(2),
 operator   varchar2(9));

insert into stage values (‘C1’,’P1’,’D1’,’OP1’);
insert into stage values (‘C1’,’P2’,’D2’,’OP2’);
insert into stage values (‘C1’,’P3’,’D3’,’OP3’);
insert into stage values (‘C1’,’P1’,’D4’,’OP3’);
insert into stage values (‘C2’,’P2’,’D1’,’OP2’);
insert into stage values (‘C2’,’P3’,’D2’,’OP4’);
insert into stage values (‘C3’,’P1’,’D2’,’OP1’);
insert into stage values (‘C3’,’P1’,’D3’,’OP1’);

commit;

create table output
(customer varchar2(2),
 product_str varchar2(100),
 order_date_str varchar2(100),
operator_str   varchar2(100));

SQL>Insert into output 
Select customer, stragg(product) ,stragg(order_date), stragg(operator)
From stage
Group by customer;

Thank you!

=================

Tom,

On average the products ordered by a customer is 18. But it could be as high as 40 and as 
low as 1.

a) the three fields need to be correlated -- eg: sort order_date and operator 
by product and string them in that order. 

==> The three fields need to be correlated


b) are the three fields "unique" in the customer.  eg: will the same product appear over 
and over

==> The product can be repeated over and over again. So can order date and operator ID.

Thanks!






 

and we said...

I would use row_number() and pivot then.  It will be faster then modifying stragg to 
procedurally "sort" the correlated data with duplicates.


ops$tkyte@ORA920PC> declare
  2          l_stmt long;
  3  begin
  4          l_stmt := 'insert into output select customer, rtrim( 
max(decode(rn,1,product)) ';
  5      for i in 2 .. 40
  6          loop
  7                  l_stmt := l_stmt || ' || '','' || max(decode( rn, ' || i || ', 
product ))';
  8          end loop;
  9          l_stmt := l_stmt || ', '','' ), rtrim( max(decode(rn,1,order_date)) ';
 10      for i in 2 .. 40
 11          loop
 12                  l_stmt := l_stmt || ' || '','' || max( decode( rn, ' || i || ', 
order_date ))';
 13          end loop;
 14          l_stmt := l_stmt || ', '','' ), rtrim( max(decode(rn,1,operator)) ';
 15      for i in 2 .. 40
 16          loop
 17                  l_stmt := l_stmt || ' || '','' || max( decode( rn, ' || i || ', 
operator ))';
 18          end loop;
 19          l_stmt := l_stmt || ', '','' ) from (select stage.*, row_number() over 
(partition by customer order by rowid) rn from stage)';
 20          l_stmt := l_stmt || ' group by customer';
 21
 22          execute immediate l_stmt;
 23  end;
 24  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC> select * from output;
 
CU PRODUCT_STR
-- 
------------------------------------------------------------------------------------------
----------
ORDER_DATE_STR
------------------------------------------------------------------------------------------
----------
OPERATOR_STR
------------------------------------------------------------------------------------------
----------
C1 P1,P2,P3,P1
D1,D2,D3,D4
OP1,OP2,OP3,OP3
 
C2 P2,P3
D1,D2
OP2,OP4
 
C3 P1,P1
D2,D3
OP1,OP1
 

If you are worried about "having 41" someday, just add:


 l_stmt := l_stmt || ' where decode( rn, 41, 1/0 ) is null group by customer';


that way, the statement will FAIL if there are 41 -- letting you know -- so you can 
increase the size of the "loop"

 

Reviews    
5 stars Product String field is not sorted   February 16, 2004 - 9am Central time zone
Reviewer: Philip Kuan from Fairfax, VA USA
Tom,

Sorry if I did not make myself clear. The product_str field needs to be sorted by product name.  

So the output for C1 should be P1,P1,P2,P3. And sort order_date and operator by product and string 
them in that order.

Thank you.
 


Followup   February 16, 2004 - 10am Central time zone:

sorry -- order by product_name, rowid


 (partition by customer order by PRODUCT_NAME, rowid)


 

5 stars Excellent   February 16, 2004 - 8pm Central time zone
Reviewer: Philip Kuan from Fairfax, VA, USA
Tom,

The solution works really well.  Thank you so much! 


3 stars Good   March 14, 2004 - 10pm Central time zone
Reviewer: James from Oregon,USA
Dear Sir,
How to create our own operator using the command
 sql> create operator op1 ..
Do you have a demo for this?
Please do reply.
 


Followup   March 15, 2004 - 7am Central time zone:

No, i do not -- but the documentation does:

http://download-west.oracle.com/docs/cd/B10501_01/appdev.920/a96595/toc.htm
 

4 stars user defined analytic function   March 18, 2004 - 7am Central time zone
Reviewer: Nopparat V. from Thailand
It will be a good idea if we can create a user defined analytic function in the same way we did 
with a user defined aggregate function. (Oracle should consider its possibility)

I have modified your StrAgg function to use in case sorting is required.

create or replace type stringOrder as object (
TheString  varchar2(10) ,
TheOrder   number(2) ) ;
/
create or replace type StringAggType2 as object
(
   theString varchar2(4000),
   MaxOrder  number(2),
   static function
        ODCIAggregateInitialize(sctx IN OUT StringAggType2 )
        return number,
   member function
        ODCIAggregateIterate(self IN OUT StringAggType2 ,
                             value IN StringOrder )
        return number,
   member function
        ODCIAggregateTerminate(self IN StringAggType2,
                               returnValue OUT  varchar2,
                               flags IN number)
        return number,
   member function
        ODCIAggregateMerge(self IN OUT StringAggType2,
                           ctx2 IN StringAggType2)
        return number
);
/
create or replace type body StringAggType2
 is
 static function ODCIAggregateInitialize(
   sctx IN OUT StringAggType2)
 return number
 is
 begin
     sctx := StringAggType2( null,0 );
     return ODCIConst.Success;
 end;
 member function ODCIAggregateIterate(
   self IN OUT StringAggType2 ,
   value IN StringOrder )
 return number
 is
 begin
   if value.TheOrder > self.MaxOrder then
     self.theString := self.theString ||
                       lpad(',',value.TheOrder-self.MaxOrder,',') ||
                       value.TheString ;
     self.MaxOrder := value.TheOrder ;
   elsif value.TheOrder < self.MaxOrder then
     self.theString := substr(self.theString,1,instr(self.theString,',',1,value.TheOrder))||
                       value.TheString ||
                       substr(self.theString,instr(self.theString,',',1,value.TheOrder)+1) ;
   end if ;
   return ODCIConst.Success;
 end;
 member function ODCIAggregateTerminate(
   self IN StringAggType2,
   returnValue OUT  varchar2,
   flags IN number)
 return number
 is
 begin
     returnValue := rtrim( ltrim( self.theString, ',' ), ',' );
     return ODCIConst.Success;
 end;
 member function ODCIAggregateMerge(
   self IN OUT StringAggType2,
   ctx2 IN StringAggType2)
 return number
 is
 begin
     self.theString := self.theString || ',' || ctx2.theString;
     self.MaxOrder := self.MaxOrder + ctx2.MaxOrder ;
     return ODCIConst.Success;
 end;
end ;
/
CREATE OR REPLACE FUNCTION StrAggOrd(input StringOrder)
RETURN varchar2
PARALLEL_ENABLE AGGREGATE USING StringAggType2;
/

Then we can use it by

select deptno, StrAggOrd(StringOrder(ename,rn)) member
from 
  (select deptno, ename, 
   row_number () over (partition by deptno order by ename) rn
   from emp)
group by deptno ;

How about this solution ? 


Followup   March 18, 2004 - 8am Central time zone:

what do you mean?  user defined aggregates are documented to be "analytic friendly"

Just take my original stragg unchanged:

ops$tkyte@ORA9IR2> l
  1  select deptno, stragg(ename) over (partition by deptno order by ename) ename
  2* from emp
ops$tkyte@ORA9IR2> /
 
    DEPTNO ENAME
---------- ----------------------------------------
        10 CLARK
        10 CLARK,KING
        10 CLARK,KING,MILLER
        20 ADAMS
        20 ADAMS,FORD
        20 ADAMS,FORD,JONES
        20 ADAMS,FORD,JONES,SCOTT
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN
        30 ALLEN,BLAKE
        30 ALLEN,BLAKE,JAMES
        30 ALLEN,BLAKE,JAMES,MARTIN
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
 
14 rows selected.
 
 

5 stars exactly what I was looking for yesterday   March 18, 2004 - 10am Central time zone
Reviewer: bob from PA
Your holding out on me Tom. :)

Yesterday when I asked:

"Using StringAgg, how can I order the concatenation so that it occurs alphabetically? "

You gave me:
 
"You'd have to "save up" the inputs and then sort them and then concatenate them. 
 they come in random order to stragg... "

Today, you give Philip the key with the use of analytics.
I guess I just didn't know how to ask my question correctly, but thankfully philp was trying to do 
the same exact thing I was.

This site is great!
 


Followup   March 18, 2004 - 10am Central time zone:

but analytics don't really do that -- for if you look at the results -- you didn't get just an 
alphabetic listing of enames by deptno, you got a record per employee!

So, in response to:

<quote>
It will be a good idea if we can create a user defined analytic function in the 
same way we did with a user defined aggregate function. (Oracle should consider 
its possibility)
</quote>

the answer is "well, we do that".  But if you want stragg to work as an AGGREGATE with sorted data 
-- the answer is "aggregates don't do that"


Ok, here is one that does "sorted" aggregates:


ops$tkyte@ORA9IR2> create or replace type vcArray as table of varchar2(4000)
  2  /
Type created.


ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace type string_agg_type as object
  2  (
  3     data  vcArray,
  4
  5     static function
  6          ODCIAggregateInitialize(sctx IN OUT string_agg_type )
  7          return number,
  8
  9     member function
 10          ODCIAggregateIterate(self IN OUT string_agg_type ,
 11                               value IN varchar2 )
 12          return number,
 13
 14     member function
 15          ODCIAggregateTerminate(self IN string_agg_type,
 16                                 returnValue OUT  varchar2,
 17                                 flags IN number)
 18          return number,
 19
 20     member function
 21          ODCIAggregateMerge(self IN OUT string_agg_type,
 22                             ctx2 IN string_agg_type)
 23          return number
 24  );
 25  /
 
Type created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace type body string_agg_type
  2  is
  3
  4  static function ODCIAggregateInitialize(sctx IN OUT string_agg_type)
  5  return number
  6  is
  7  begin
  8      sctx := string_agg_type( vcArray() );
  9      return ODCIConst.Success;
 10  end;
 11
 12  member function ODCIAggregateIterate(self IN OUT string_agg_type,
 13                                       value IN varchar2 )
 14  return number
 15  is
 16  begin
 17      data.extend;
 18      data(data.count) := value;
 19      return ODCIConst.Success;
 20  end;
 21
 22  member function ODCIAggregateTerminate(self IN string_agg_type,
 23                                         returnValue OUT varchar2,
 24                                         flags IN number)
 25  return number
 26  is
 27      l_data varchar2(4000);
 28  begin
 29      for x in ( select column_value from TABLE(data) order by 1 )
 30      loop
 31              l_data := l_data || ',' || x.column_value;
 32      end loop;
 33      returnValue := ltrim(l_data,',');
 34      return ODCIConst.Success;
 35  end;
 36
 37  member function ODCIAggregateMerge(self IN OUT string_agg_type,
 38                                     ctx2 IN string_agg_type)
 39  return number
 40  is
 41  begin -- not really tested ;)
 42      for i in 1 .. ctx2.data.count
 43      loop
 44              data.extend;
 45              data(data.count) := ctx2.data(i);
 46      end loop;
 47      return ODCIConst.Success;
 48  end;
 49
 50
 51  end;
 52  /
 
Type body created.
 
ops$tkyte@ORA9IR2> show err
No errors.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> CREATE or replace
  2  FUNCTION stragg(input varchar2 )
  3  RETURN varchar2
  4  PARALLEL_ENABLE AGGREGATE USING string_agg_type;
  5  /
 
Function created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> column ename format a40
ops$tkyte@ORA9IR2> select deptno, stragg(ename) ename
  2    from emp
  3   group by deptno
  4  /
 
    DEPTNO ENAME
---------- ----------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
 


You could take the above other example and apply the same technique (the row_number() concept is 
intriguing as it allows you to specify some other sorted order) 

5 stars same result   March 18, 2004 - 12pm Central time zone
Reviewer: bob from PA
Hi Tom,

I had just wrapped that earlier result (one row per employee) in another analytic to grab the 
result from the last employee that contained the full list for the dept, so ultimately, the initial 
use of analytics which returned a result for each employee led me to use another one, to get what I 
was looking for without rewriting the user defined aggregate like you did.

I am sure this is far from ideal, but it worked initially. 


5 stars Thank you   March 18, 2004 - 11pm Central time zone
Reviewer: Nopparat V. from Thailand
I'm sorry that I didn't finish the whole chapter of User-Defined Aggregate Functions really. You're 
right. It all was said in the book.

Also I agree with Bob. If we can use original StringAgg with some analytic options in order to 
complete our requirement, it should be the better choice. I think the original one can be used in 
more general purpose than the other modified ones. 


Followup   March 19, 2004 - 8am Central time zone:

but analytics are fundementally *not* aggregates.

Think about how many times the analytic function is invoked vs how many times the aggregate is..... 

5 stars More Questions   March 19, 2004 - 12am Central time zone
Reviewer: Nopparat V. from Thailand
I have found from the book that there is another interface ODCIAggregateDelete to use in case of 
analytic function. I, however, can't understand it well. Could you explain me more how it work? or 
Just show me some example to implement.

Also please advise whether I should add this interface into the original StrAggType. How it affect 
the performance when we use it in the simple aggregate action? 


Followup   March 19, 2004 - 8am Central time zone:

it is an optimization (and a good question -- at first I was going to leave this as an exercise for 
the reader, but I got curious myself and proved that "every day you can learn something new about 
Oracle", this was my new thing today :)

http://download-west.oracle.com/docs/cd/B10501_01/appdev.920/a96595/dci11agg.htm#1004794
http://download-west.oracle.com/docs/cd/B10501_01/appdev.920/a96595/dci19agg.htm#76648
It would be most useful with sliding windows, or non "top anchored" windows.

In the above example -- the stragg result was always "growing" or just "reset".  it would never 
delete.  If I had a SLIDING window however -- it would be different.  Consider a sliding window of 
the "preceding, current and next" row.  stragg would be gluing three rows together and slide 
through the result set.  Well, there are two ways to do this -- with and without the delete.

In order to see the difference, we'll create a small package that is useful for collecting your own 
custom statistics in your session:

ops$tkyte@ORA9IR2> create or replace context stats_ctx using stats_pkg
  2  /
 
Context created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace package stats_pkg
  2  as
  3      procedure reset;
  4      procedure inc( p_name in varchar2, p_amt in number default 1 );
  5      procedure show( p_reset in boolean default false );
  6  end;
  7  /
 
Package created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace package body stats_pkg
  2  as
  3  procedure reset
  4  is
  5  begin
  6      for x in ( select attribute
  7                   from session_context
  8                  where namespace = 'STATS_CTX' )
  9      loop
 10          dbms_session.set_context( 'stats_ctx', x.attribute, 0 );
 11      end loop;
 12  end;
 13
 14
 15  procedure inc( p_name in varchar2, p_amt in number default 1 )
 16  is
 17  begin
 18      dbms_session.set_context( 'stats_ctx', p_name, 
              nvl(sys_context('stats_ctx',p_name),0)+p_amt );
 19  end;
 20
 21  procedure show( p_reset in boolean default false )
 22  is
 23  begin
 24      for x in ( select attribute, value
 25                   from session_context
 26                  where namespace = 'STATS_CTX'
 27                  order by attribute )
 28      loop
 29          dbms_output.put_line( rpad( x.attribute, 32 ) || x.value );
 30          if ( p_reset )
 31          then
 32              dbms_session.set_context( 'stats_ctx', x.attribute, 0 );
 33          end if;
 34      end loop;
 35  end;
 36
 37  end;
 38  /
 
Package body created.


Now, I instrumented the original stragg with:

...
  4  static function ODCIAggregateInitialize(sctx IN OUT string_agg_type)
  5  return number
  6  is
  7  begin
  8      sctx := string_agg_type( null );
  9      stats_pkg.inc( 'ODCIAggregateInitialize' );
 10      return ODCIConst.Success;
 11  end;
......

in each and every entry point.  Now, we run some queries:


ops$tkyte@ORA9IR2> column enames format a50
ops$tkyte@ORA9IR2> select deptno, stragg(ename) over (order by ename rows between 1 preceding and 1 
following ) enames
  2    from emp
  3  /
 
    DEPTNO ENAMES
---------- --------------------------------------------------
        20 ADAMS,ALLEN
        30 ADAMS,ALLEN,BLAKE
        30 ALLEN,BLAKE,CLARK
        10 BLAKE,CLARK,FORD
...
        30 SMITH,TURNER,WARD
        30 TURNER,WARD
 
14 rows selected.
 
ops$tkyte@ORA9IR2> exec stats_pkg.show( true );
ODCIAGGREGATEINITIALIZE         13
ODCIAGGREGATEITERATE            38
ODCIAGGREGATETERMINATE          15
 
PL/SQL procedure successfully completed.

as you can see -- that iterate routine was called LOTS as we slid through.  that is because each 
three row "window" was constructed one by one.  Now, look at a "growing only" example:


ops$tkyte@ORA9IR2> select deptno, stragg(ename) over (partition by deptno order by ename) enames
  2    from emp
  3  /
 
    DEPTNO ENAMES
---------- --------------------------------------------------
        10 CLARK
        10 CLARK,KING
        10 CLARK,KING,MILLER
        20 ADAMS
        20 ADAMS,FORD
...
        30 ALLEN,BLAKE,JAMES,MARTIN
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
 
14 rows selected.
 
ops$tkyte@ORA9IR2> exec stats_pkg.show( true );
ODCIAGGREGATEINITIALIZE         3
ODCIAGGREGATEITERATE            14
ODCIAGGREGATETERMINATE          15
 
PL/SQL procedure successfully completed.

iterate was called lots less (as was initialize -- once per partition)

Lets add this to the spec/body:

ops$tkyte@ORA9IR2> create or replace type string_agg_type as object
  2  (
....
 20     member function
 21          ODCIAggregateDelete(self IN OUT string_agg_type,
 22                                 value IN varchar2 )
 23          return number,
...

ops$tkyte@ORA9IR2> create or replace type body string_agg_type
  2  is
  3
...
 23  member function ODCIAggregateDelete(self IN OUT string_agg_type,
 24                                      value IN varchar2 )
 25  return number
 26  is
 27  begin
 28      self.total := replace( self.total, ',' || value, '' );
 29      stats_pkg.inc( 'ODCIAggregateDelete' );
 30      return ODCIConst.Success;
 31  end;
......

NOTE: warning Will Robinson -- danger danger....  I took a totally cheesy cheat here.  A simple 
replace.  This only works ONLY works because ename is UNIQUE (and contains no commas!).  Our code 
would have to obviously be more sophisticated for this to really work in general!!!!!

Now lets rerun that query:


ops$tkyte@ORA9IR2> select deptno, stragg(ename) over (order by ename rows between 1 preceding and 1 
following ) enames
  2    from emp
  3  /
 
    DEPTNO ENAMES
---------- --------------------------------------------------
        20 ADAMS,ALLEN
        30 ADAMS,ALLEN,BLAKE
        30 ALLEN,BLAKE,CLARK
        10 BLAKE,CLARK,FORD
...
        20 SCOTT,SMITH,TURNER
        30 SMITH,TURNER,WARD
        30 TURNER,WARD
 
14 rows selected.
 
ops$tkyte@ORA9IR2> exec stats_pkg.show( true );
ODCIAGGREGATEDELETE             12
ODCIAGGREGATEINITIALIZE         1
ODCIAGGREGATEITERATE            14
ODCIAGGREGATETERMINATE          15


42 calls, vs 66 calls before -- and if the delete was "efficient", perhaps significantly reduced 
resource usage.

Note however that a growing example:


ops$tkyte@ORA9IR2> select deptno, stragg(ename) over (partition by deptno order by ename) enames
  2    from emp
  3  /
 
    DEPTNO ENAMES
---------- --------------------------------------------------
....
        30 ALLEN,BLAKE,JAMES
        30 ALLEN,BLAKE,JAMES,MARTIN
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
 
14 rows selected.
 
ops$tkyte@ORA9IR2> exec stats_pkg.show( true );
ODCIAGGREGATEDELETE             0
ODCIAGGREGATEINITIALIZE         3
ODCIAGGREGATEITERATE            14
ODCIAGGREGATETERMINATE          15
 
PL/SQL procedure successfully completed.

won't be affected...
 

5 stars Thank you   March 20, 2004 - 3am Central time zone
Reviewer: Nopparat V. from Thailand
Thank you very much. I'll try its effects on normal aggregrate by myself. But I think it probably 
result in the same manner as the last one. Then it mean we can add, and probably should add ,the 
delete function if we have any efficient one with no negative effects.

I strongly agree with you that "everyday you can learn something new about Oracle", especially from 
this website.  


5 stars Another Curiousity   March 20, 2004 - 4am Central time zone
Reviewer: Nopparat V. from Thailand
I expect the usage count of OCIAggregateTerminate of all sample to be 14 (equal to 14 rows of 
result). Why it become 15? When was it invoked for the additional one? 


Followup   March 20, 2004 - 10am Central time zone:

you fetch 15 times for 14 rows, guess they are calling it regardless.   

5 stars   May 26, 2004 - 4am Central time zone
Reviewer: A reader 


5 stars stragg ... sometimes   July 2, 2004 - 1pm Central time zone
Reviewer: New reader from Wilmington, DE USA
STRAGG is great.  I use it all the time.  However I have to average a set of results only when all 
of the results in the set have a null operator.  If any of the results have a value for the 
operator (<,>,etc) then all of the results should be STRAGG'd.  As you can see below, because the 
set where assay_id = 2 has both null and non-null operators, I'm getting both the average and the 
STRAGG returned.  Any suggestions to remove unwanted row?

CREATE TABLE t (
    alt_id VARCHAR2(30),
    lot_id VARCHAR2(30),
    OPERATOR VARCHAR2(5),
    result NUMBER,
    assay_id NUMBER,
    result_type NUMBER);

INSERT INTO t SELECT 'CorpID 1','1',NULL,10,1,10 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1',NULL,14,1,10 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1',NULL,15,2,4 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1',NULL,15,2,4 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1','>',50,2,4 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1','<',8,3,7 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1','<',10,3,7 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1','<',50,4,12 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1','>',35,4,12 FROM dual;


CREATE OR REPLACE FUNCTION stragg (
    pCol IN VARCHAR2,
    pTbl IN VARCHAR2,
    pWhere IN LONG,
    pOrder IN VARCHAR2,
    pSeparator IN VARCHAR2 DEFAULT ', ') RETURN VARCHAR2
AS
    TYPE cur IS REF CURSOR;
    myCur cur;
    sAggValue VARCHAR2(4000);
    sSep VARCHAR2(10);
    sAggregate VARCHAR2(4000);
BEGIN
    OPEN myCur FOR
    'SELECT ' || pCol || ' ' || 'FROM ' || pTbl || ' ' || pWhere || ' ' || pOrder;

    LOOP
    FETCH myCur into sAggValue;
        EXIT WHEN myCur%NOTFOUND;

        sAggregate := sAggregate || sSep || sAggValue;
        sSep := pSeparator;
    END LOOP;
    CLOSE myCur;

    RETURN sAggregate;
END stragg;


SELECT DISTINCT alt_id,
    lot_id,
    CASE WHEN operator IS NULL THEN To_Char(Avg(result))
         ELSE stragg('operator||result', 't', 'WHERE alt_id=''' || alt_id || ''' and lot_id=''' || 
lot_id || ''' and assay_id=' || assay_id || ' and result_type=' || result_type, 'ORDER BY result')
    END AS result,
    assay_id,
    result_type
FROM t
GROUP BY alt_id, lot_id, operator, assay_id, result_type
ORDER BY alt_id, lot_id, assay_id, result_type;


alt_id        lot_id    result        assay_id    result_type
--------    ------    ------        --------    -----------
CorpID 1    1    12        1        10
CorpID 1    1    15        2        4 *unwanted
CorpID 1    1    15,15,>50    2        4
CorpID 1    1    <8,<10        3        7
CorpID 1    1    >35,<50        4        12 


Followup   July 2, 2004 - 2pm Central time zone:

don't get it. 

5 stars stragg ... sometimes   July 2, 2004 - 2pm Central time zone
Reviewer: New Reader from Wilmington, DE
The following statement is returning the average of the 2 results of 15 (assay_id=2) and the 
STRAGG'd list of results for all assay_id=2 (15,15,>50).  For any group of results where one or 
more of the operators is not null (i.e. <,>,etc where assay_id=2 in this case), it should return 
just the STRAGG, not the average.  So I should not get the second row because the two results of 15 
are being reported in the third row.

SELECT DISTINCT alt_id,
    lot_id,
    CASE WHEN operator IS NULL THEN To_Char(Avg(result))
         ELSE stragg('operator||result', 't', 'WHERE alt_id=''' || alt_id || ''' and lot_id=''' || 
lot_id || ''' and assay_id=' || assay_id || ' and result_type=' || result_type, 'ORDER BY result')
    END AS result,
    assay_id,
    result_type
FROM t
GROUP BY alt_id, lot_id, operator, assay_id, result_type
ORDER BY alt_id, lot_id, assay_id, result_type;


alt_id        lot_id    result        assay_id    result_type
--------    ------    ------        --------    -----------
CorpID 1    1    12        1        10
CorpID 1    1    15        2        4 *unwanted
CorpID 1    1    15,15,>50    2        4
CorpID 1    1    <8,<10        3        7
CorpID 1    1    >35,<50        4        12 


Followup   July 2, 2004 - 4pm Central time zone:

so, you want to process rows such that either

a) all of the operators are NULL in a given "group" - alt_id/lot_id being the definition of a group

b) only do the rows in a group where the operations are not null, if there is at least one not null 
operator

Use this in place of "T" in your query:


(
select * 
  from (
select t.*, max(operator) over (partition by alt_id, lot_id) max_oper
  from t
       )
 where max_oper is null OR operator is not null
)


that'll find if there is a non-null operator in your group -- and keep only rows  you want. 

4 stars Packaged function   August 19, 2004 - 2pm Central time zone
Reviewer: A reader 
Can stragg be a packaged function instead of a standalone function? 

Seems not, why is this?

Thanks 


Followup   August 19, 2004 - 7pm Central time zone:

you need a standalone function, it is just what it is.

ops$tkyte@ORA9IR2> CREATE or replace
  2  FUNCTION stragg(input varchar2 )
  3  RETURN varchar2
  4  PARALLEL_ENABLE AGGREGATE USING string_agg_type;
  5  /

stragg is *just a name*, it is mapped to a type - the type is just like a package.

No additional functionality could be gained by a package here. 

5 stars User-defined aggregates vs. builtin aggregates   August 27, 2004 - 9am Central time zone
Reviewer: A reader 
Just curious...if I write my own version of some of the builtin aggregates like sum, min, max, etc, 
would that perform the same as builtins?

In other words, are user-defined aggregates second-class citizens or does Oracle internally apply 
the same optimizations it does to both?

Thanks 


Followup   August 27, 2004 - 10am Central time zone:

builtin functions will almost always be more performant than user written code.

nothing about being a "second class citizen"

everything about "compiled C code generally runs faster than pcode from plsql" 

5 stars User-defined aggregates vs. builtin aggregates   August 27, 2004 - 10am Central time zone
Reviewer: A reader 
"everything about compiled C code generally runs faster than pcode from plsql" 

So would native PL/SQL compilation help here? What if I natively compile my user-defined aggragate 
function, that would create native compiled C code (dll or .so whatever) and then they should 
perform the same, right? 


Followup   August 27, 2004 - 10am Central time zone:

native compilation can help in limited circumstances.  you still go through the plsql runtime 
engine. 

3 stars collect   September 13, 2004 - 6am Central time zone
Reviewer: Laurent Schneider from Switzerland
pseudo string aggregate in 10g with COLLECT


SQL> create or replace type tov as table of varchar2(255);
/

Type created.

SQL> select deptno, cast(collect(ename) as tov) STRAGG
from emp group by deptno;

    DEPTNO STRAGG
---------- 
---------------------------------------------------------------------------------------------------
        10 TOV('CLARK', 'KING', 'MILLER')
        20 TOV('SMITH', 'JONES', 'SCOTT', 'ADAMS', 'FORD')
        30 TOV('ALLEN', 'WARD', 'MARTIN', 'BLAKE', 'TURNER', 'JAMES')



Regards
Laurent
 


3 stars Locate STRAGG?   November 30, 2004 - 8am Central time zone
Reviewer: A reader from B
Earlier, you mention 
>> Just take my original stragg unchanged:

I'm trying to locate the code, but cannot find it.
Coul you add a link on this page? 
http://asktom.oracle.com/pls/ask/f?p=4950:8:12690073736180799678::NO::F4950_P8_DISPLAYID,F4950_P8_CR
ITERIA:15637744429336,
 


Followup   November 30, 2004 - 9am Central time zone:

when i plugged

stragg

into the search, the very first hit had it?


http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:2196162600402

5 stars Sorted aggregates   June 13, 2005 - 11am Central time zone
Reviewer: A reader 
This thread has aged a little...

In 9iR2, whats the latest and greatest way to get sorted aggregates using stragg? Tom presented a 
solution, some other reviewer had a suggestion, which one is the one to use?

Ideally, it should be as simple as adding another parameter to stragg i.e. 
stragg(<expression>,'asc|desc'). Why cant it that simple?

Thanks 


Followup   June 13, 2005 - 12pm Central time zone:

because aggregates are sets, the concept of "sorting" does not apply to them at all.  


stragg with analytics works
or
stashing the data in your own state and stringing it up at the end works


ops$tkyte@ORA9IR2> select *
  2    from (
  3  select deptno,
  4         case when row_number() over (partition by deptno order by ename)=1
  5             then stragg(ename) over
  6                  (partition by deptno
  7                       order by ename
  8                         rows between unbounded preceding
  9                                  and unbounded following)
 10         end enames
 11    from emp
 12         )
 13   where enames is not null
 14  /
 
    DEPTNO ENAMES
---------- ------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
 

5 stars Re: Sorted aggregates   June 14, 2005 - 8am Central time zone
Reviewer: Laurent Schneider from Switzerland
For Ascending order, you maybe can use "distinct" to simulate the sort.

select deptno, stragg(distinct ename) ename
from emp group by deptno;

    DEPTNO ENAME
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD




for desc order, you could use analytics

select distinct deptno, stragg(ename) over (partition by deptno order by ename desc rows between 
unbounded preceding and unbounded following) ename
from emp;

    DEPTNO ENAME
---------- --------------------------------------------------
        10 MILLER,KING,CLARK
        20 SMITH,SCOTT,JONES,FORD,ADAMS
        30 WARD,TURNER,MARTIN,JAMES,BLAKE,ALLEN

or a workaround like 

SQL> select deptno, translate(stragg(distinct translate(ename, 'ABCDEFGHIJKLMNOQRSTUVWXYZ', 
'ZYXWVUTSRQONMLKJIHGFEDCBA')), 'ABCDEFGHIJKLMNOQRSTUVWXYZ', 'ZYXWVUTSRQONMLKJIHGFEDCBA') ename  
from emp group by deptno;

    DEPTNO ENAME
---------- --------------------------------------------------
        10 MILLER,KING,CLARK
        20 SMITH,SCOTT,JONES,FORD,ADAMS
        30 WARD,TURNER,MARTIN,JAMES,BLAKE,ALLEN


I understand "a reader" confusion, because most aggregate functions use commutative properties like 
+ and *.

Ex: SUM(1,2,3) = SUM(3,2,1)
AVG(1,10,100) = AVG(100, 10, 1)

The stragg UDAG is not deterministic. It is probably best not to use it in a MView or in a Function 
Based Index.

Kind regards
 


Followup   June 14, 2005 - 10am Central time zone:

don't -- that is like assuming GROUP BY sorts, it might, it might not.  distinct might sort, it 
might not. 

5 stars RE: that is like assuming GROUP BY sorts   June 20, 2005 - 8am Central time zone
Reviewer: Laurent Schneider from switzerland
Thanks for the info. It is fairly surprising! Nice to know !

Here is a proof that group by does not necessarly sort the way I expected :

SQL> create index desc_job_emp on emp(job desc); 

Index created.

SQL> select job from emp where job < 'Q' group by job;
JOB
---------
PRESIDENT
MANAGER
CLERK
ANALYST

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 
   1    0   SORT (GROUP BY NOSORT) (Cost=2 Card=10 
   2    1     INDEX (RANGE SCAN) OF 'DESC_JOB_EMP' (INDEX)

It is also stated in the doc : 
"The GROUP BY clause groups rows but does not guarantee the order of the result set. To order the 
groupings, use the ORDER BY clause."

About the distinct, I did not succeed until yet in finding a query that does not sort per 
col1,col2,col3... The doc do not specify whether it should sort or not, so thanks for pointing that 
it is not safe to assume it will ! 


Followup   June 20, 2005 - 10am Central time zone:

the only way to get sorted data....
   use order by. 

4 stars   November 2, 2005 - 6pm Central time zone
Reviewer: siva prakash from Chicago, IL USA
Thanks Kyte . 

I learned something new today . 

can we recreate this procedure with LOB datatype.
 


Followup   November 3, 2005 - 6am Central time zone:

http://asktom.oracle.com/pls/ask/search?p_string=stragg+clob

5 stars   November 4, 2005 - 1am Central time zone
Reviewer: A reader 
Hi Tom,

I modified the stragg for clob datatype. It works perfectly fine for small data. But for large data 
sets, it takes forever e.g

select employer_code,stragg(cust_name) 
from Customer where rownum < 1001
group by employer_code 

Customer table has 86,000 rows. 

It returns results in few msecs. But when I remove the where clause, it takes close to 2hrs. At 
times it return with error "unable to extend TEMP tablespace"

I am using 9.2.0.6 dedicated server, with workarea_size_policy=manual, sort_area_size=64MB.

Is there any way, i can optimize it?
 


Followup   November 4, 2005 - 3am Central time zone:

build up strings of 32k of data and put them into a clob when they get close to 32k (buffer in a 
string, append big strings to the clob - instead of tons of small strings) 

5 stars strAgg   November 9, 2005 - 11am Central time zone
Reviewer: Phil from UK
Hi Tom
Before we create our own Aggregate function, can you let us know if there is an easier way to do 
this?
We have a table with a number datatype and we simply want to do a product(column) rather than a 
sum(column). Is this easier than we think? 


5 stars Packaged version of stragg   January 4, 2006 - 4pm Central time zone
Reviewer: Stewart W. Bryson from Atlanta, GA
Tom Kyte said:
"No additional functionality could be gained by a package here."

I wrote a package of ETL mappings that use STRAGG to generate MERGE commands from the dictionary. 
However, I'd like for this package to contain the STRAGG functionality for portability reasons. 
Even though there is no additional functionality gained from packaging STRAGG, could it be done 
anyway for portability?

Thanks.
 


Followup   January 5, 2006 - 9am Central time zone:

how would this enhance "portability" in any way?

as shown above, what you see is what you get - you need that function stub to expose it, you need 
the type to implement it. 

5 stars what I meant by portability   January 6, 2006 - 11am Central time zone
Reviewer: Stewart W. Bryson from Atlanta, GA
This package will be rolled out to countless databases by other DBA's managing those systems. 
Management would like all the functionality inside one package.

If it can't be done... it can't be done. Just checking.

Thanks. 


5 stars A pure SQL solution for the original "STRAGG" question   January 16, 2006 - 1pm Central time zone
Reviewer: Frank Zhou from Braintree, MA
Hi Tom,
     Here is a pure SQL solution for the original "STRAGG" question,
user defined aggregate function "STRAGG" is not required.

Thanks,

Frank
     

SQL> SELECT * FROM STAGE;

CU PR OR OPERATOR                                                                                   
                    
-- -- -- ---------                                                                                  
                    
C1 P1 D1 OP1                                                                                        
                    
C1 P2 D2 OP2                                                                                        
                    
C1 P3 D3 OP3                                                                                        
                    
C1 P1 D4 OP3                                                                                        
                    
C2 P2 D1 OP2                                                                                        
                    
C2 P3 D2 OP4                                                                                        
                    
C3 P1 D3 OP1                                                                                        
                    
C3 P1 D2 OP1                                                                                        
                    
8 rows selected.

SQL> SELECT customer,
  2  MAX(LTRIM( sys_connect_by_path( product, ',' ) ,  ','))  prod_str,
  3  MAX(LTRIM( sys_connect_by_path( order_date , ',' ) ,  ','))  order_str,
  4  MAX(LTRIM( sys_connect_by_path( OPERATOR  , ',' ) ,  ','))  OPERATOR_str
  5  FROM
  6  (
  7    SELECT customer, product  , order_date , OPERATOR,
  8           row_number() over (PARTITION BY customer ORDER BY ROWID) rn
  9    FROM STAGE
 10  )
 11  START WITH rn = 1
 12  CONNECT BY customer = PRIOR customer AND  PRIOR  rn  = rn -1
 13  GROUP BY customer;

CU                                                                                                  
                    
--                                                                                                  
                    
PROD_STR                                                                                            
                    
----------------------------------------------------------------------------------------------------
--------------------
ORDER_STR                                                                                           
                    
----------------------------------------------------------------------------------------------------
--------------------
OPERATOR_STR                                                                                        
                    
----------------------------------------------------------------------------------------------------
--------------------
C1                                                                                                  
                    
P1,P2,P3,P1                                                                                         
                    
D1,D2,D3,D4                                                                                         
                    
OP1,OP2,OP3,OP3                                                                                     
                    
                                                                                                    
                    

CU                                                                                                  
                    
--                                                                                                  
                    
PROD_STR                                                                                            
                    
----------------------------------------------------------------------------------------------------
--------------------
ORDER_STR                                                                                           
                    
----------------------------------------------------------------------------------------------------
--------------------
OPERATOR_STR                                                                                        
                    
----------------------------------------------------------------------------------------------------
--------------------
C2                                                                                                  
                    
P2,P3                                                                                               
                    
D1,D2                                                                                               
                    
OP2,OP4                                                                                             
                    
                                                                                                    
                    

CU                                                                                                  
                    
--                                                                                                  
                    
PROD_STR                                                                                            
                    
----------------------------------------------------------------------------------------------------
--------------------
ORDER_STR                                                                                           
                    
----------------------------------------------------------------------------------------------------
--------------------
OPERATOR_STR                                                                                        
                    
----------------------------------------------------------------------------------------------------
--------------------
C3                                                                                                  
                    
P1,P1                                                                                               
                    
D3,D2                                                                                               
                    
OP1,OP1                                                                                             
                    
                                                                                                    
                    

SQL> spool off
 


3 stars Thanks   March 29, 2006 - 1am Central time zone
Reviewer: Guna Srinivas 
Hi Tom... 
Really i appriciate your work... I learn new things everyday. This gives very confident to speak 
other tech. people.. 

Thanks for your valuable work...

 


4 stars How do i use stragg in oracle 8i   March 29, 2006 - 1am Central time zone
Reviewer: Guna Srinivas 
Hi Tom,
   I am using oracle 8i (Client not ready to move 9i now). So that i could not execute stragg 
function. Is there anyway can i get the same out for the initial question...
   
Thanks in advance...

 


Followup   March 29, 2006 - 7am Central time zone:

search site for pivot 

5 stars Reg: A pure SQL solution for the original "STRAGG" question   July 26, 2006 - 3pm Central time zone
Reviewer: ST from MI USA
From the above review dated January 16th, 2006 by Frank Zhou  from Braintree, MA, Is there a way we 
can eliminate any duplicates from the list that is being concatenated. Like in the first row we see 
P1,P2,P3,P1. Can we have only P1,P2,P3 or P1+P2+P3 in the result set? Thanks for your help. 


Followup   July 26, 2006 - 4pm Central time zone:

add some distincting in there.  just get the distinct things you want to connect by first. 

4 stars Reg: A pure SQL solution for the original "STRAGG" question   July 26, 2006 - 4pm Central time zone
Reviewer: ST 
Tried the distinct in the inline view but with the same result as it gets the row numbers as 
distinct. Still getting the same result (P1,P2,P3,P1). 


Followup   July 26, 2006 - 4pm Central time zone:

if you just want the products, drop the other columns from the select before the distinct (the 
order date and stuff are not duplicated) 

5 stars Reg: A pure SQL solution for the original "STRAGG" question   July 26, 2006 - 4pm Central time zone
Reviewer: ST 
No luck even after removing the additional fields. The problem that we are having is because if it 
gets the rn, then it thinks it is a distinct for that record in the inline view and hence the 
result is still showing P1 twice with P1,P2,P3,P1 where as I am looking for a result of P1,P2,P3. 
Thanks for your quick response. 


Followup   July 26, 2006 - 5pm Central time zone:

i'm saying to distinct before that.


ops$tkyte%ORA10GR2> SELECT customer,
  2  MAX(LTRIM( sys_connect_by_path( product, ',' ) ,  ','))  prod_str
  3  FROM
  4  (
  5    SELECT customer, product,
  6           row_number() over (PARTITION BY customer ORDER BY product) rn
  7    FROM (select distinct customer, product from STAGE)
  8  )
  9  START WITH rn = 1
 10  CONNECT BY customer = PRIOR customer AND  PRIOR  rn  = rn -1
 11  GROUP BY customer;

CU PROD_STR
-- ----------------------------------------
C2 P2,P3
C1 P1,P2,P3
C3 P1
 

5 stars Reg: A pure SQL solution for the original "STRAGG" question   July 26, 2006 - 5pm Central time zone
Reviewer: ST from MI USA
Here is the query as per your suggestion except that I am having two tables in the sub query:

select r2.STDID, r2.term,
       max(LTRIM( sys_connect_by_path(r2.location, ' ' ), ' ')) location
  from ( select STDID, term, location,
                row_number() over (partition by STDID order by STDID )  rn
           from (select distinct reg.STDID, sec.term, sec.location 
                    from abc.sections sec, abc.registration reg
                    where sec.SECTIONID=reg.SECTIONID 
                 and sec.TERM=reg.TERM and sec.TERM='2001/2' )) r2
  START WITH rn = 1
  CONNECT BY r2.STDID = PRIOR r2.STDID AND  PRIOR  rn  = rn -1
 group by r2.VSTDID,r2.vterm;


Here is the result that I am getting where I am looking for 'MC' instead of multiple MC's


STDID|TERM|LOCATION
00001|2001/2|MC MC MC
00002|2001/2|MC
00003|2001/2|DL MC SLB
00004|2001/2|MC MC MC MC
00005|2001/2|MC MC MC MC MC MC MC MC MC
00006|2001/2|MC MC MC MC
00007|2001/2|MC MC MC MC
00008|2001/2|MC MC
00009|2001/2|MC MC MC MC
00010|2001/2|MC
00011|2001/2|MC MC DL
00012|2001/2|MC OEO

Any idea what should be done if there are two tables ? 


Followup   July 26, 2006 - 5pm Central time zone:

create tables
insert intos

this doesn't look right to me, should work.   

5 stars   July 26, 2006 - 6pm Central time zone
Reviewer: ST from MI USA
The tables that are here with the required columns and are short listed as there are lot of columns 
from the original table.

CREATE TABLE REGISTRATION
(
  TERM        VARCHAR2(7),
  STDID       VARCHAR2(10),
  SECTIONID   VARCHAR2(7)
);

CREATE TABLE SECTIONS
(
  TERM             VARCHAR2(7),
  SECTIONID        VARCHAR2(21),
  LOCATION         VARCHAR2(10)
);


The data is comming from the these two tables and provided the results from the original tables. 
Hope this helps. The results provided are actual results. FYI 


Followup   July 26, 2006 - 6pm Central time zone:

I sort of need some inserts?

looking for that "test case that reproduces the problem" 

5 stars Reg: A pure SQL solution for the original "STRAGG" question   July 26, 2006 - 6pm Central time zone
Reviewer: ST from MI USA
OK here are some sample data inserts:

insert into sections values('2001/2','17459','MC');
insert into sections values('2001/2','18350','MC');
insert into sections values('2001/2','18227','MC');
insert into sections values('2001/2','17186','DL');
insert into sections values('2001/2','17796','MC');
insert into sections values('2001/2','17621','MC');
.......

insert into registration values('2001/2','00002','17459');
insert into registration values('2001/2','00002','18350');
insert into registration values('2001/2','00002','18227');
insert into registration values('2001/2','00181','17186');
insert into registration values('2001/2','00181','17796');
insert into registration values('2001/2','00181','17621');
.... 


Followup   July 26, 2006 - 6pm Central time zone:

that'll be a bug and you now have the perfect test case for support when you add these two queries:

ops$tkyte%ORA10GR2> select STDID, term, location,
  2                  row_number() over (partition by STDID order by STDID )  rn
  3             from (select distinct reg.STDID, sec.term, sec.location
  4                      from sections sec, registration reg
  5                      where sec.SECTIONID=reg.SECTIONID
  6                   and sec.TERM=reg.TERM and sec.TERM='2001/2' )
  7  /

STDID      TERM    LOCATION           RN
---------- ------- ---------- ----------
00002      2001/2  MC                  1
00181      2001/2  DL                  1
00181      2001/2  MC                  2

ops$tkyte%ORA10GR2> select rpad('*',2*level,'*') || r2.location loc, rn,
  2         r2.STDID, r2.term,
  3         (LTRIM( sys_connect_by_path(r2.location, ' ' ), ' ')) location
  4    from ( select STDID, term, location,
  5                  row_number() over (partition by STDID order by STDID )  rn
  6             from (select distinct reg.STDID, sec.term, sec.location
  7                      from sections sec, registration reg
  8                      where sec.SECTIdiONID=reg.SECTIONID
  9                   and sec.TERM=reg.TERM and sec.TERM='2001/2' )) r2
 10    START WITH rn = 1
 11    CONNECT BY r2.STDID = PRIOR r2.STDID AND  PRIOR  rn  = rn -1
 12  /

LOC                     RN STDID      TERM    LOCATION
--------------- ---------- ---------- ------- ----------
**MC                     1 00002      2001/2  MC
**DL                     1 00181      2001/2  DL
****DL                   2 00181      2001/2  DL DL
 
clearly shows the location is getting munged some how

the MC got turned into DL!

Here is the workaround (sort by something other than stdid)


ops$tkyte%ORA10GR2> select r2.STDID, r2.term,
  2         max(LTRIM( sys_connect_by_path(r2.location, ' ' ), ' ')) location
  3    from ( select STDID, term, location,
  4                  row_number() over (partition by STDID order by location )  rn
  5             from (select distinct reg.STDID, sec.term, sec.location
  6                      from sections sec, registration reg
  7                      where sec.SECTIONID=reg.SECTIONID
  8                   and sec.TERM=reg.TERM and sec.TERM='2001/2' )) r2
  9    START WITH rn = 1
 10    CONNECT BY r2.STDID = PRIOR r2.STDID AND  PRIOR  rn  = rn -1
 11   group by r2.STDID,r2.term;

STDID      TERM    LOCATION
---------- ------- ----------
00181      2001/2  DL MC
00002      2001/2  MC

 

5 stars Reg: A pure SQL solution for the original "STRAGG" question   July 26, 2006 - 7pm Central time zone
Reviewer: ST from MI USA
Thanks for your quick response and appreciate your help. But the problem still persists using the 
script provided by you and I am still getting 3 MC's for stdid = '00002' and so on for other 
stdid's. I am on Oracle 9.2.0.4 Does it make any difference with this version? 


Followup   July 26, 2006 - 7pm Central time zone:

there appears to be a bug - please contact support, as I said the data is definitely munged up 
there.

If you go back to your original source data, you'll find there are not 3 MC's for that stdid, it is 
a bug. 

5 stars   July 26, 2006 - 7pm Central time zone
Reviewer: ST from MI USA
Thank you very much and appreciate for all your help. I will check back with the support on this 
issue.  


5 stars STRAGG   July 27, 2006 - 2pm Central time zone
Reviewer: ST from MI USA
OK Now I was able to get the concatenated output using the STRAGG. Another issue with using STRAGG 
is I am unable to sort based on START WITH condition. Is there any way we can put conditions so 
that we can concatenate the sorted values in the string as per the requirements?

 


Followup   July 27, 2006 - 4pm Central time zone:

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:15637744429336
there are examples there with sorts. 

4 stars Stragg with clob and free memory   October 2, 2006 - 11am Central time zone
Reviewer: Ron from Quebec, Canada
Hi Tom,

we modify the fonction stragg for Clob variable in output.  We had some problems with the clob free 
memory (clob are not free of the memory).

Here is the code of the new clob fonction:

drop function stragg_clobs;
drop type stragg_clob;



create or replace type stragg_clob as object
(
total clob,
static function
ODCIAggregateInitialize(sctx IN OUT stragg_clob )
return number,
member function
ODCIAggregateIterate(self IN OUT stragg_clob ,value IN varchar2 )
return number,
member function    
ODCIAggregateTerminate(self IN OUT stragg_clob,
                              returnValue OUT  clob,
                             flags IN number)
return number, 
member function
ODCIAggregateMerge(self IN OUT stragg_clob,
                        ctx2 IN stragg_clob)
return number
);
/

--Type created.

create or replace type body stragg_clob
is
static function ODCIAggregateInitialize(sctx IN OUT stragg_clob)
return number
is
begin
sctx := stragg_clob( null );
return ODCIConst.Success;
end;
member function ODCIAggregateIterate(self IN OUT stragg_clob,
                                       value IN varchar2 )
return number
is
begin
self.total := self.total || ',' || value; 
return ODCIConst.Success;
end;      

member function ODCIAggregateTerminate(self IN OUT stragg_clob,
                                        returnValue OUT clob,
                                        flags IN number)
return number
is
begin
returnValue := ltrim(self.total,',');
dbms_lob.freetemporary(self.total);
return ODCIConst.Success;
end;
member function ODCIAggregateMerge(self IN OUT stragg_clob,
                                    ctx2 IN stragg_clob)
return number
is
begin
self.total := self.total || ctx2.total;

return ODCIConst.Success;
end;
end;
/
show error
--Type body created.


     CREATE or replace
FUNCTION stragg_clobs(input varchar2 )
RETURN clob
PARALLEL_ENABLE AGGREGATE USING stragg_clob;
/



grant execute on stragg_clob to public;
--

grant execute on stragg_clobs to public;
--
drop  public synonym stragg_clobs ;
create public synonym stragg_clobs for stragg_clobs;

if we do some query with this new fonction, we can see the memory is not free between each 
execution.

We use 

select * from v$temporary_lobs

to check the cache_lobs.

If we use this new fonction with a lot of data, Oracle return us popup with the message  "out of 
memory"

How can we free the memory so we won't have this message

Thanks
Ron  


Followup   October 2, 2006 - 11am Central time zone:

is the client letting the clobs be freed? (eg: freeing the clob).  You are sort of letting them 
copy the clob when they fetch it (think about what would happen if YOU closed THEIR clob - say I 
called your function and it returned to me 10 rows.  I have now 10 clobs in my client program, but 
you think you "closed" them??)

 

4 stars how to get this   November 8, 2006 - 7pm Central time zone
Reviewer: Rahul from Columbus
i have a table containing 2 columns

position   component_item 
1           786
1           787
0           789
1           790
1

Now based on the position i have to construct a string like

786-787-null-789-790

whereever there is zero in the position i have to put a null value there and then construct a 
string

the position 0 can be any where in the outmost five records 
and the component item is going to be 4 
so i have to buid the string like given above  


Followup   November 8, 2006 - 8pm Central time zone:

use stragg on 

decode( position, 0, 'null-', '' ) || component_item 

5 stars A 10G alternative to my 9I pure SQL STRAGG soultion   November 22, 2006 - 3pm Central time zone
Reviewer: Frank Zhou from Braintree, MA
Hi Tom,

      Here is a 10G SQL Model Clause alternative to my 9i pure sql STRAGG soultion posted on 
January 16, 2006. SQL Model Clause is really awsome!! This is an other SQL Evolution...  

Thanks,

Frank

SQL> select customer,Prod_Str, order_str, OPERATOR_str
  2  from
  3   (select customer, Prod_Str, OPERATOR_str, order_str, rn, counter
  4   from stage
  5   model
  6   partition by (customer)
  7   dimension by (row_number()over(PARTITION BY customer ORDER BY rowid) rn )
      measures (product, order_date, operator, CAST(NULL AS VARCHAR2(3999)) 
               Prod_Str, CAST(NULL AS VARCHAR2(3999)) order_str, 
               CAST(NULL AS VARCHAR2(3999)) OPERATOR_str,
 10   count(*) over (partition by customer) counter
 11   )
 12   RULES (
 13   Prod_Str[ANY] ORDER BY rn  =
 14                 CASE WHEN product[cv() - 1 ] IS NULL
 15            THEN product[cv()]
 16                     ELSE Prod_Str[cv()-1]||','|| product[cv()]
 17        END ,
 18   order_str[ANY] ORDER BY rn  =
 19                 CASE WHEN order_date[cv() - 1 ] IS NULL
 20            THEN order_date[cv()]
 21                     ELSE order_str[cv()-1]||','|| order_date[cv()]
 22        END ,
 23   OPERATOR_str[ANY] ORDER BY rn  =
 24                 CASE WHEN operator[cv() - 1 ] IS NULL
 25            THEN operator[cv()]
 26                     ELSE OPERATOR_str[cv()-1]||','|| operator[cv()]
 27        END
 28    )
 29   )
 30  where counter = rn;

CU                                                                              
--                                                                              
PROD_STR                                                                        
--------------------------------------------------------------------------------
ORDER_STR                                                                       
--------------------------------------------------------------------------------
OPERATOR_STR                                                                    
--------------------------------------------------------------------------------
C1                                                                              
P1,P2,P3,P1                                                                     
D1,D2,D3,D4                                                                     
OP1,OP2,OP3,OP3                                                                 
                                                                                

CU                                                                              
--                                                                              
PROD_STR                                                                        
--------------------------------------------------------------------------------
ORDER_STR                                                                       
--------------------------------------------------------------------------------
OPERATOR_STR                                                                    
--------------------------------------------------------------------------------
C2                                                                              
P2,P3                                                                           
D1,D2                                                                           
OP2,OP4                                                                         
                                                                                

CU                                                                              
--                                                                              
PROD_STR                                                                        
--------------------------------------------------------------------------------
ORDER_STR                                                                       
--------------------------------------------------------------------------------
OPERATOR_STR                                                                    
--------------------------------------------------------------------------------
C3                                                                              
P1,P1                                                                           
D2,D3                                                                           
OP1,OP1                                                                         
                                                                                 


5 stars Re: A 10G alternative to my 9I pure SQL STRAGG soultion   November 27, 2006 - 9am Central time zone
Reviewer: Laurent Schneider from Switzerland
a nice use of the MODEL clause 


4 stars Another MODEL Solution   February 22, 2007 - 1pm Central time zone
Reviewer: SnippetyJoe from Toronto, ON, Canada

In case anyone is interested, here's another way to use MODEL to perform string aggregation.

create table t6( group_key varchar2(10), val varchar2(10) );

insert into t6 values ( 'Group 1' , 'a'  );
insert into t6 values ( 'Group 2' , 'a'  );
insert into t6 values ( 'Group 2' , 'b'  );
insert into t6 values ( 'Group 3' , 'a'  );
insert into t6 values ( 'Group 3' , 'b'  );
insert into t6 values ( 'Group 3' , 'c'  );
insert into t6 values ( 'Group 4' , 'a'  );
insert into t6 values ( 'Group 4' , 'a'  );
insert into t6 values ( 'Group 4' , 'b'  );
insert into t6 values ( 'Group 4' , 'b'  );
insert into t6 values ( 'Group 5' , 'a'  );
insert into t6 values ( 'Group 5' , 'b'  );
insert into t6 values ( 'Group 5' , null );
insert into t6 values ( 'Group 5' , 'd'  );
insert into t6 values ( 'Group 5' , 'e'  );
insert into t6 values ( 'Group 6' , null );

commit;

column group_key format a10
column string    format a15

select
  group_key ,
  substr( string, 2 ) as string
from
  dual
where
  1 = 2
model
  reference t6_ref
    on
    (
      select
        row_number() over ( order by val ) - 1 as row_num ,
        count(*) over () - 1                   as max_row_num ,
        group_key ,
        val
      from
        t6
      where
        val is not null
      order by
        val
    )
    dimension by( row_num )
    measures    ( max_row_num, group_key, val )
  main t6_main
    dimension by ( cast( null as varchar2(4000) ) as group_key )
    measures     ( cast( null as varchar2(4000) ) as string )
    rules
      iterate( 4294967295 )
      until
      ( t6_ref.max_row_num[0] is null or
        iteration_number >= t6_ref.max_row_num[0]
      )
    (
      string[ t6_ref.group_key[iteration_number] ] =
        string[ cv() ] || ',' ||
        t6_ref.val[ iteration_number ]
    )
order by
  1
;

GROUP_KEY  STRING
---------- ---------------
Group 1    a
Group 2    a,b
Group 3    a,b,c
Group 4    a,a,b,b
Group 5    a,b,d,e


You can find a detailed explanation of how this query works here.
http://www.sqlsnippets.com/en/topic-11754.html

A performance comparison between this method and hierarchical, object relational, and STRAGG solutions is available here.
http://www.sqlsnippets.com/en/topic-11783.html



5 stars Using STRAGG in a child table join   March 23, 2007 - 9am Central time zone
Reviewer: Bill Ferguson from Denver, CO
Hi Tom,

Your stragg function is great, but I am having difficulties figuring out how to use it in a join for child tables with multiple entries. I can use it on one table without problems, but I can't figure out how to use it with a join on child tables.

I would like the parent table information with the child table information STRAGG'ed.

My (abbreviated) tables are:
CREATE TABLE DEPOSITS_BASE
(
  DEP_ID            NUMBER(12)                  NOT NULL,
  DEV_ST            VARCHAR2(25 BYTE)           NOT NULL,
  OPER_TP           VARCHAR2(30 BYTE)           NOT NULL,
  SIG               VARCHAR2(1 BYTE));
/

Insert into DEPOSITS_BASE (DEP_ID,DEV_ST,OPER_TP,SIG) 
   values (55555555,'Past Producer','Underground','N');
/


CREATE TABLE COMMODITY_BASE
(
  DEP_ID       NUMBER(12)                       NOT NULL,
  LINE         NUMBER(4)                        NOT NULL,
  COMMOD_GROUP VARCHAR2(40 BYTE)                NOT NULL);
/

Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,5,'Abrasives');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,2,'Tungsten');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,4,'Copper');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,1,'Gold');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,3,'Silver');
/

CREATE TABLE LOCATIONS_BASE
(
  DEP_ID       NUMBER(12)                       NOT NULL,
  LINE         NUMBER(4)                        NOT NULL,
  COUNTRY      VARCHAR2(70 BYTE),
  STATE_PROV   VARCHAR2(80 BYTE),
  COUNTY       VARCHAR2(80 BYTE) );
/

Insert into LOCATIONS_BASE (DEP_ID,LINE,COUNTRY,STATE_PROV,COUNTY) 
  values (55555555,1,'United States','Colorado','Adams');
Insert into LOCATIONS_BASE (DEP_ID,LINE,COUNTRY,STATE_PROV,COUNTY) 
  values (55555555,3,'United States','Montana','Deer Lodge');
Insert into LOCATIONS_BASE (DEP_ID,LINE,COUNTRY,STATE_PROV,COUNTY) 
  values (55555555,2,'United States','Nevada','White Pine');
/


The DEP_ID field of course is the common link between the tables.

Ideally I would like a one line output (per DEP_ID) of the STRAGG'ed output sorted by LINE as well, as that signifies importance to the end users. So, with the above sample data, what the end-users would like would be all the data from DEPOSITS_BASE, the STRAGG'ed COMMOD_GROUP, and somehow the STRAGG'ed output of COUNTRY, STATE_PROV, and COUNTY.

I'm completely lost trying to figure out how to accomplish this, if it even can be.

Thanks,
Bill Ferguson

Followup   March 23, 2007 - 10am Central time zone:

child tables have nothing to do with it - it is like using avg, sum, min, max.

It is just an aggregate function.


How about this - you supply the query to join the above tables together and then, we'll just wrap that query (call it Q) in ()


select *
from (Q)


Now, we can just

select column, stragg(another_column)
from (Q)
group by column;


5 stars Using STRAGG in a child table join (revisited)   March 23, 2007 - 11am Central time zone
Reviewer: Bill Ferguson from Denver, CO
Hi again Tom,

I must be a real dummy. I just can't seem to get it to work correctly with three tables.

Doing the following (with two tables) works fine:

SELECT dep_id, stringagg(commod_group) commod_group --, stringagg(country) country,
   --stringagg(state_prov) state_prov, stringagg(county) county
FROM (
SELECT dep_id, dev_st, oper_tp, sig, d.NAME, commod_group --,
--       country, state_prov, county
FROM ( --( 
 deposits d LEFT OUTER JOIN commodity c USING (dep_id))
--   LEFT OUTER JOIN locations l USING (dep_id))
WHERE dep_id = 55555555   
)
GROUP BY dep_id


I get the one line output with the commod_group aggregated together.

When I remove the '--' comment markers and re-run it, then I get back multiples of each commod_group, one for each entry in the locations_base table.

I'll get back 3 Gold's, 3 Silver's, etc. I can see why, with the join creating a line for each combination in the enclosed query, but I can't figure out how to (I guess) get a distinct stringagg output per column.


Followup   March 23, 2007 - 11am Central time zone:

give us your query table query
5 stars Using STRAGG in a child table join (revisited again)   March 23, 2007 - 1pm Central time zone
Reviewer: Bill Ferguson from Denver, CO
Okay, I guess maybe these questions are sent as emails, so seeing the previous 'post' is difficult?

Anyway, I'll consolidate the above two posts into one.

I would like the parent table information with the child table information STRAGG'ed.

My (abbreviated) tables are:
CREATE TABLE DEPOSITS_BASE
(
  DEP_ID            NUMBER(12)                  NOT NULL,
  DEV_ST            VARCHAR2(25 BYTE)           NOT NULL,
  OPER_TP           VARCHAR2(30 BYTE)           NOT NULL,
  SIG               VARCHAR2(1 BYTE));
/

Insert into DEPOSITS_BASE (DEP_ID,DEV_ST,OPER_TP,SIG) 
   values (55555555,'Past Producer','Underground','N');
/


CREATE TABLE COMMODITY_BASE
(
  DEP_ID       NUMBER(12)                       NOT NULL,
  LINE         NUMBER(4)                        NOT NULL,
  COMMOD_GROUP VARCHAR2(40 BYTE)                NOT NULL);
/

Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,5,'Abrasives');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,2,'Tungsten');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,4,'Copper');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,1,'Gold');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,3,'Silver');
/

CREATE TABLE LOCATIONS_BASE
(
  DEP_ID       NUMBER(12)                       NOT NULL,
  LINE         NUMBER(4)                        NOT NULL,
  COUNTRY      VARCHAR2(70 BYTE),
  STATE_PROV   VARCHAR2(80 BYTE),
  COUNTY       VARCHAR2(80 BYTE) );
/

Insert into LOCATIONS_BASE (DEP_ID,LINE,COUNTRY,STATE_PROV,COUNTY) 
  values (55555555,1,'United States','Colorado','Adams');
Insert into LOCATIONS_BASE (DEP_ID,LINE,COUNTRY,STATE_PROV,COUNTY) 
  values (55555555,3,'United States','Montana','Deer Lodge');
Insert into LOCATIONS_BASE (DEP_ID,LINE,COUNTRY,STATE_PROV,COUNTY) 
  values (55555555,2,'United States','Nevada','White Pine');
/


Ideally I would like a one line output (per DEP_ID) of the STRAGG'ed output sorted by LINE as well, as that signifies importance to the end users. So, with the above sample data, what the end-users would like would be all the data from DEPOSITS_BASE, the STRAGG'ed COMMOD_GROUP, and somehow the STRAGG'ed output of COUNTRY, STATE_PROV, and COUNTY.

If I do:
SELECT dep_id, stringagg(commod_group) commod_group --, stringagg(country) country,
   --stringagg(state_prov) state_prov, stringagg(county) county
FROM (
SELECT dep_id, dev_st, oper_tp, sig, d.NAME, commod_group --,
--       country, state_prov, county
FROM ( --( 
 deposits_base d LEFT OUTER JOIN commodity_base c USING (dep_id))
--   LEFT OUTER JOIN locations_base l USING (dep_id))
WHERE dep_id = 55555555   
)
GROUP BY dep_id;


It works fine. The commod_group is STRAGG'ed together as expected. If I remove the comment markers and rerun it, then the commod_group, country, state_prov, and county entries are repeated for each additional join against LOCATIONS_BASE.

Followup   March 26, 2007 - 6am Central time zone:

well, I don't believe this to be a "stragg" problem at all - but rather just a "my data model doesn't support my question" (replace stragg with ANY OTHER AGGREGATE like min or max and you'll see what I mean)

You have what appears to be a 1:m + 1:m relationship here.

Say you have a relation DEPT TO EMP - that is one to many. And you have a deptno with 5 employees - you expect "5 rows"

Say you have a relation DEPT TO DEPT_MANAGERS and it is one to many. And that deptno with 5 employess - has 2 managers.

Say you join dept to emp and dept to dept_managers in a single query. You get 10 rows (5x2) for that deptno.

There are no duplicates, you are just getting the natural result of two 1:m relations in a single query.

I cannot get your query to run - some word wrapping going on. Best not to be tricky with comments like that - not sure what goes on which line from this end. I see the last comment really only - I'd have to page up and up and up to rebuild any sort of context.

But, this is not stragg - this is just a natural side effect of your data. That is what I asked for a query without aggregates that had the right data - sort of thinking "such a query doesn't exist"
3 stars Another way of using SQL modeling   April 19, 2007 - 6pm Central time zone
Reviewer: Charlie Zhu from Victoria, BC Canada
http://tkyte.blogspot.com/2006/08/evolution.html

http://asktom.oracle.com/pls/ask/f?p=4950:8:13056644700682648402::NO::F4950_P8_DISPLAYID,F4950_P8_CR
ITERIA:15637744429336


String Aggregate, Another way of using SQL modeling.

3 steps to illustrate it:

step 1,

select job, ename,
row_number() over ( partition by job order by ename ) as position
, count(*) over ( partition by job ) emp_cnt
from scott.emp;

JOB    ENAME    POSITION  EMP_CNT
--------- ---------- ---------- ----------
ANALYST  FORD          1      2
ANALYST  SCOTT          2      2
CLERK  ADAMS          1      4
CLERK  JAMES          2      4
CLERK  MILLER        3      4
CLERK  SMITH          4      4
MANAGER  BLAKE          1      3
MANAGER  CLARK          2      3
MANAGER  JONES          3      3
PRESIDENT KING          1      1
SALESMAN ALLEN          1      4
SALESMAN MARTIN        2      4
SALESMAN TURNER        3      4
SALESMAN WARD          4      4

14 rows selected.

Elapsed: 00:00:00.04

Execution Plan
----------------------------------------------------------
Plan hash value: 3145491563

---------------------------------------------------------------------------
| Id | Operation      | Name | Rows | Bytes | Cost (%CPU)| Time  |
---------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |    |  14 |  196 |  3 (34)| 00:00:01 |
|  1 | WINDOW SORT    |    |  14 |  196 |  3 (34)| 00:00:01 |
|  2 |  TABLE ACCESS FULL| EMP |  14 |  196 |  2  (0)| 00:00:01 |
---------------------------------------------------------------------------
step 2,

select job, empname, position, emp_cnt, enames from
(
select job, ename,
row_number() over ( partition by job order by ename ) as position
, count(*) over ( partition by job ) emp_cnt
from scott.emp
)
MODEL
RETURN UPDATED ROWS
MAIN simple_model
PARTITION BY (job)
DIMENSION BY (position)
MEASURES (ename empname, ' ' enames, emp_cnt emp_cnt)
RULES
  (enames[any] = empname[CV()] ||','||enames[CV()-1]
  )
;

JOB    EMPNAME    POSITION  EMP_CNT ENAMES
--------- ---------- ---------- ---------- --------------------------------
CLERK  ADAMS          1      4 ADAMS,
CLERK  JAMES          2      4 JAMES,ADAMS,
CLERK  MILLER        3      4 MILLER,JAMES,ADAMS,
CLERK  SMITH          4      4 SMITH,MILLER,JAMES,ADAMS,
SALESMAN ALLEN          1      4 ALLEN,
SALESMAN MARTIN        2      4 MARTIN,ALLEN,
SALESMAN TURNER        3      4 TURNER,MARTIN,ALLEN,
SALESMAN WARD          4      4 WARD,TURNER,MARTIN,ALLEN,
PRESIDENT KING          1      1 KING,
MANAGER  BLAKE          1      3 BLAKE,
MANAGER  CLARK          2      3 CLARK,BLAKE,
MANAGER  JONES          3      3 JONES,CLARK,BLAKE,
ANALYST  FORD          1      2 FORD,
ANALYST  SCOTT          2      2 SCOTT,FORD,

14 rows selected.

Elapsed: 00:00:00.04

Execution Plan
----------------------------------------------------------
Plan hash value: 2296444387

-----------------------------------------------------------------------------
| Id | Operation        | Name | Rows | Bytes | Cost (%CPU)| Time  |
-----------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |    |  14 |  546 |  3 (34)| 00:00:01 |
|  1 | SQL MODEL ORDERED  |    |  14 |  546 |        |      |
|  2 |  VIEW          |    |  14 |  546 |  3 (34)| 00:00:01 |
|  3 |  WINDOW SORT    |    |  14 |  196 |  3 (34)| 00:00:01 |
|  4 |  TABLE ACCESS FULL| EMP |  14 |  196 |  2  (0)| 00:00:01 |
-----------------------------------------------------------------------------

step 3,

select * from
(
select job, empname, position, emp_cnt, enames from
(
select job, ename,
row_number() over ( partition by job order by ename ) as position
, count(*) over ( partition by job ) emp_cnt
from scott.emp
)
MODEL
RETURN UPDATED ROWS
MAIN simple_model
PARTITION BY (job)
DIMENSION BY (position)
MEASURES (ename empname, ' ' enames, emp_cnt emp_cnt)
RULES
  (enames[any] = empname[CV()] ||Nvl2(enames[CV()-1],',','')||enames[CV()-1]
  )
)
where position = emp_cnt
;

JOB    EMPNAME    POSITION  EMP_CNT ENAMES
--------- ---------- ---------- ---------- --------------------------------
CLERK  SMITH          4      4 SMITH,MILLER,JAMES,ADAMS
SALESMAN WARD          4      4 WARD,TURNER,MARTIN,ALLEN
PRESIDENT KING          1      1 KING
MANAGER  JONES          3      3 JONES,CLARK,BLAKE
ANALYST  SCOTT          2      2 SCOTT,FORD

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 3377437572

------------------------------------------------------------------------------
| Id | Operation        | Name | Rows | Bytes | Cost (%CPU)| Time  |
------------------------------------------------------------------------------
|  0 | SELECT STATEMENT    |    |  14 |  588 |  3 (34)| 00:00:01 |
|* 1 | VIEW          |    |  14 |  588 |  3 (34)| 00:00:01 |
|  2 |  SQL MODEL ORDERED  |    |  14 |  546 |        |      |
|  3 |  VIEW          |    |  14 |  546 |  3 (34)| 00:00:01 |
|  4 |  WINDOW SORT    |    |  14 |  196 |  3 (34)| 00:00:01 |
|  5 |    TABLE ACCESS FULL| EMP |  14 |  196 |  2  (0)| 00:00:01 |
------------------------------------------------------------------------------


5 stars stragg   September 4, 2007 - 3pm Central time zone
Reviewer: Jay from Herndon, VA
Hello Tom,
Thanks a bunch for letting us know about the stragg function. Especially, I love this revision that you have made where it sorts the aggregated data. When you get a chance, could you please explain to me what you are doing here?? I'm like the Alice who's lost in the wonderland here.

I would be very thankful to you if you could please guide me Tom. Thank you so much and I apologize for the bother!

ops$tkyte@ORA9IR2> create or replace type vcArray as table of varchar2(4000)
  2  /
Type created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace type string_agg_type as object
  2  (
  3     data  vcArray,
  4
  5     static function
  6          ODCIAggregateInitialize(sctx IN OUT string_agg_type )
  7          return number,
  8
  9     member function
 10          ODCIAggregateIterate(self IN OUT string_agg_type ,
 11                               value IN varchar2 )
 12          return number,
 13
 14     member function
 15          ODCIAggregateTerminate(self IN string_agg_type,
 16                                 returnValue OUT  varchar2,
 17                                 flags IN number)
 18          return number,
 19
 20     member function
 21          ODCIAggregateMerge(self IN OUT string_agg_type,
 22                             ctx2 IN string_agg_type)
 23          return number
 24  );
 25  /
 
Type created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace type body string_agg_type
  2  is
  3
  4  static function ODCIAggregateInitialize(sctx IN OUT string_agg_type)
  5  return number
  6  is
  7  begin
  8      sctx := string_agg_type( vcArray() );
  9      return ODCIConst.Success;
 10  end;
 11
 12  member function ODCIAggregateIterate(self IN OUT string_agg_type,
 13                                       value IN varchar2 )
 14  return number
 15  is
 16  begin
 17      data.extend;
 18      data(data.count) := value;
 19      return ODCIConst.Success;
 20  end;
 21
 22  member function ODCIAggregateTerminate(self IN string_agg_type,
 23                                         returnValue OUT varchar2,
 24                                         flags IN number)
 25  return number
 26  is
 27      l_data varchar2(4000);
 28  begin
 29      for x in ( select column_value from TABLE(data) order by 1 )
 30      loop
 31              l_data := l_data || ',' || x.column_value;
 32      end loop;
 33      returnValue := ltrim(l_data,',');
 34      return ODCIConst.Success;
 35  end;
 36
 37  member function ODCIAggregateMerge(self IN OUT string_agg_type,
 38                                     ctx2 IN string_agg_type)
 39  return number
 40  is
 41  begin -- not really tested ;)
 42      for i in 1 .. ctx2.data.count
 43      loop
 44              data.extend;
 45              data(data.count) := ctx2.data(i);
 46      end loop;
 47      return ODCIConst.Success;
 48  end;
 49
 50
 51  end;
 52  /
 
Type body created.
 
ops$tkyte@ORA9IR2> show err
No errors.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> CREATE or replace
  2  FUNCTION stragg(input varchar2 )
  3  RETURN varchar2
  4  PARALLEL_ENABLE AGGREGATE USING string_agg_type;
  5  /
 
Function created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> column ename format a40
ops$tkyte@ORA9IR2> select deptno, stragg(ename) ename
  2    from emp
  3   group by deptno
  4  /
 
    DEPTNO ENAME
---------- ----------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD



Followup   September 5, 2007 - 1pm Central time zone:

which part needs to be explained?
5 stars thanks for the response..   September 5, 2007 - 3pm Central time zone
Reviewer: Jay from Herndon, VA
Thanks for the response Tom. Just a brief overview of the logic you've incorporated here would be very helpful.
How can I start thinking to come up with a logic like that in the creation of type body? How do I know when to use the ODCIAggregateTerminate, Initialize, Merge etc, functions. Where do I get a list of complete functions such as this? How do I decide when to do a type body or a function or a package or a stored procedure? I am sorry for asking some really weird questions, but, I have no prior experience with the pl/sql programming and want to get better at it. I am just amazed by this site and your skills. Thanks a lot Tom. You are a gift to the oracle world!

Followup   September 5, 2007 - 5pm Central time zone:

ahh, ok:

http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14289/dciaggfns.htm#sthref542



5 stars Thanks!   September 6, 2007 - 9am Central time zone
Reviewer: Jay from Herndon, VA
Thanks for your time and the link Tom.
Have a wonderful day!


4 stars sorted stragg   December 10, 2007 - 11am Central time zone
Reviewer: Sean from Columbus, OH USA
I've seen many mentionings of aggregates applying to SETS, which is why ordering doesn't apply.  I 
understand that.

But, what about analytics?

stragg is usable as an analytic as well as an aggregate.

I've used the following syntax with stragg on many occasions successfully.  Have I just been 
getting lucky?

SELECT DISTINCT deptno,
                stragg(ename) OVER(PARTITION BY deptno 
                                   ORDER BY ename DESC 
                                   ROWS BETWEEN 
                                        UNBOUNDED PRECEDING
                                        AND 
                                        UNBOUNDED FOLLOWING)    AS enames
FROM emp

If this is a bad idea, please explain why and also explain how this could be accomplished with the 
connect by trick.  

Thanks!


Followup   December 10, 2007 - 12pm Central time zone:

it works - it is not the most "efficient" method, but it works.
5 stars not working for me :-(   December 10, 2007 - 3pm Central time zone
Reviewer: Jay from Herndon, VA
Hi Tom,
Could you please tell me why the sorting wont work for me?
with test as
(select '12345' as ids, 21 as code from dual             
 union all
 select '12345' as ids, 26 as code from dual                
 union all
 select '12345' as ids, 28 as code from dual)
 
select distinct ids,
       stragg(code) over (partition by ids order by code desc 
                          rows between unbounded preceding and unbounded following) as code
  from test 


I get '21,26,28' as the output. Shouldn't it be '28,26,21'? Am I missing something?

Thank you for your help as always!

Jay

Followup   December 10, 2007 - 4pm Central time zone:

ops$tkyte%ORA9IR2> with test as
  2  (select '12345' as ids, 21 as code from dual
  3   union all
  4   select '12345' as ids, 26 as code from dual
  5   union all
  6   select '12345' as ids, 28 as code from dual)
  7  select distinct ids,
  8         stragg(code) over (partition by ids order by code desc
  9                            rows between unbounded preceding and unbounded following) as code
 10    from test
 11  /

IDS   CODE
----- --------------------
12345 28,26,21


I get that in 9i, 10g, and 11g...



5 stars Oh my God! Spoke to soon...   December 10, 2007 - 3pm Central time zone
Reviewer: Jay from Herndon, VA
Sorry Tom. I found out the issue. I used the stragg script that does the sorting.

member function ODCIAggregateTerminate(self IN string_agg_type,
returnValue OUT varchar2,
flags IN number)
return number
is
l_data varchar2(4000);
begin
for x in ( select column_value from TABLE(data) order by 1 )
loop
l_data := l_data || ', ' || x.column_value;
end loop;
returnValue := ltrim(l_data,',');
return ODCIConst.Success;
end;

This was the problem :-(

Sorry,
Jay

5 stars thanks Tom   December 10, 2007 - 4pm Central time zone
Reviewer: Jay from Herndon, VA
Yes Tom. You are right.
Sorry for wasting your time. I almost immediately replied to that question myself with the issue.
Thanks again!
Jay

5 stars Question [out of curiosity]   December 11, 2007 - 1pm Central time zone
Reviewer: Jay from Herndon, VA
Hi Tom,
Good day! In one of your comments you had said this regarding the sorting of the stragg function.

"Followup  December 10, 2007 - 12pm US/Eastern:
it works - it is not the most "efficient" method, but it works. "

Could you please let us know if there is a more efficient way to do the same when you have some time?

Thanks a lot for your help!
Jay


Followup   December 11, 2007 - 9pm Central time zone:

read original answer?
5 stars thanks!   December 12, 2007 - 4pm Central time zone
Reviewer: Jay from Herndon, VA
Thanks a lot Tom.

Can you please clarify one thing for me? The other user posted a way to sort using analytics.

code from the other user:

SELECT DISTINCT deptno,
                stragg(ename) OVER(PARTITION BY deptno 
                                   ORDER BY ename DESC 
                                   ROWS BETWEEN 
                                        UNBOUNDED PRECEDING
                                        AND 
                                        UNBOUNDED FOLLOWING)    AS enames
FROM emp


Now, instead of doing analytics and distinct, can I just do the following query to sort? This is more simple and I think it will give me the right data. Can you please advice?

My code:

select deptno,
       stragg(ename)
  from (select deptno,                         
               ename
          from emp
        order by deptno,
                 ename desc)
group by deptno  


thanks for your time!

Jay

Followup   December 13, 2007 - 9am Central time zone:

no, you cannot - the data fed to the aggregate is fed in a non-determined fashion - we can do a hash group by for example...

it might appear to work by accident - but it would not be assured.
5 stars thanks   December 13, 2007 - 9am Central time zone
Reviewer: Jay from Herndon, VA
Okay. Thanks for your reply!

I am going to be attending the conference this evening in Reston Tom! I can't wait to see you talk (LIVE, In-Person). W0W! It's going to be great.

Thanks for the opportunity! I already made my reservation. Woohooooooo...

Thanks!
Jay

5 stars awesome presentation..   December 14, 2007 - 9am Central time zone
Reviewer: Jay from Herndon, VA
Hi Tom,

It was a great presentation yesterday at the reston oracle center. It was indeed a pleasure to see 
you in person and hear you talk! 

I have one quick suggestion. You can completely ignore this comment though ;-)

If future presentations, I think it would be really helpful for everyone if you please ask the 
audience to wait until the end of the session for the questions? There were just way too many 
questions (by the same two or three 'smartass-know-it-alls' in the audience) during the session and 
thus, you couldn't even get through half of the slides. 

Just a thought :-) 

Thanks again!

Jay


Followup   December 14, 2007 - 1pm Central time zone:

I give seminars - they last 2 to 3 days. The questions are mandatory - I'd be done in a day if it were not for them.

The questions are really pretty good actually, I was commenting to a friend afterwards that I thought the questions were actually quite good - David Aldridge was the fellow that asked me a question that I said "I don't know" to (about encrypted tablespaces and transporting) :) and he emailed me this morning:

...
David Aldridge wrote:
> Documented as a restriction, of course.
>
>
http://download.oracle.com/docs/cd/B28359_01/server.111/b28310/tspaces013.htm#i1007233

>
> It's interesting that one apparent advantage of tablespace encryption is that you can transport them, albeit with restrictions, whereas non-encrypted tablespaces with tables having encrypted columns cannot be. That might be a valid reason to choose TS encryption over column encryption.
....

So find the questions good - I encourage them.

but yes, I did not expect to spend 25 minutes on my first slide :) that did put us behind schedule a bit!
5 stars Stragg   April 15, 2008 - 10am Central time zone
Reviewer: DK from Dallas
Hi Tom

Thank you for all the knowledge you have been sharing .

I have been trying to use stragg for converting rows to columns

I have the following table and data

create table listingtest (listingid varchar2(100),categoryid varchar2(20),categoryname varchar2(300))

tablespace vztables2;

insert into listingtest values('0001','5000','Taxis');

insert into listingtest values('0001','5001','Limousine service');

insert into listingtest values('0001','5002','Transportation service');

insert into listingtest values('0002','5004','ATM servcies');

insert into listingtest values('0002','50005','Banking services');

insert into listingtest values('0003','5000','Taxis');

insert into listingtest values('0003','5004','ATM services');

select * from listingtest

LISTINGID          CATEGORYID      CATEGORYNAME                                                                                             
------------------------- -------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------
0001              5000          Taxis                                                                                                 
0001              5001          Limousine service                                                                                         
0001              5002          Transportation service                                                                                     
0002              5004          ATM servcies                                                                                             
0002              50005          Banking services                                                                                         
0003              5000          Taxis                                                                                                 
0003              5004          ATM services                                                                                             


I used stragg to get

select listingid,stragg(categoryid) categoryid,stragg(categoryname) categoryname from listingtest group

by listingid
LISTINGID          CATEGORYID          CATEGORYNAME                           

       
------------------------- -------------------------

--------------------------------------------------------------------
0001              5000,5001,5002        Taxis,Transportation service,Limousine service   

       
0002              5004,50005          ATM servcies,Banking services               

       
0003              5000,5004          Taxis,ATM services   

I need something like

LISTINGID CATEGORYID CATEGORYNAME  CATEGORYID CATEGORYNAME      CATEGORYID CATEGORYNAME  CATEGORYID CATEGORYNAME

0001    5000      Taxis      5001    Transportation service  5002    Limousine service     

0002    5004      ATM servcies  50005  Banking services

0003    5000      Taxis      5004  ATM services


Please help

thanks
DK



Followup   April 16, 2008 - 3pm Central time zone:

search this site for PIVOT


5 stars can this be done in msaccess   September 24, 2008 - 5am Central time zone
Reviewer: Duke from Amsterdam, nl
Tom, so so so sorry to have to ask this;
my project leader want to do this in a small access database.
I thought about defining hetrogenous services and do it with stragg, but he said he want to be able 
to do this from access. I can't figure out how to do this in access, and i doubt wheter i want to 
do this, but he asked me.
So once again sorry to ask, but do you know of a smiliar concept in access?


Followup   September 24, 2008 - 7pm Central time zone:

I am sorry - I do not know :)
5 stars stragg ora-06502   October 7, 2008 - 5pm Central time zone
Reviewer: Jay from Herndon, VA
Hello Tom,
The function 'stragg' has been very helpful indeed. For the first time, I get the following error 
with this function

ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at "STRING_AGG_TYPE", line 31

Line 31 refers to the following piece of code in the type body -

l_data := l_data || ', ' || x.column_value;

Could you please let me know what could be going wrong? 

Thanks for your time!
Jay



Followup   October 8, 2008 - 9pm Central time zone:

stragg as I defined it work with varchar2(4000)

you must need more

but then you are into the world of clobs. search - you'll find an implementation that does clobs - but I would recommend against it, the amount of data is probably just too long at that point.
4 stars order and ODCIAggregateMerge   April 17, 2009 - 5am Central time zone
Reviewer: Alex from Kiev
Hi, Tom!
Does "order by" in "stragg(...) over (order by ..)" guarantee that result will be ordered (in case 
PARALLEL_ENABLE stragg).
I mean is there a guarantee that ODCIAggregateMerge will merge thread results in correct 
(sort-wise) order?
or it can works like this:
id     thread      stragg
1      1           1 
2      1,2         1,2
3      3           3,1,2
4      3,4         3,4,1,2
5      5           3,4,1,2,5
6      5,6         3,4,1,2,5,6


Followup   April 17, 2009 - 9am Central time zone:

no, unless and until there is an order by on the STATEMENT, the rows in the result set can and will be returned in any order we feel like returning them.


But I'm not sure you are talking about the STATEMENT here - and given I have no idea what your example represents - I cannot really answer you...
4 stars order and ODCIAggregateMerge   April 17, 2009 - 11am Central time zone
Reviewer: Alex from Kiev
Ok. Let me ask this another way.

For example:
  1  select deptno,
  2  stragg(ename) over(partition by deptno order by ename) s,
  3  row_number() over(order by deptno,ename) rn
  4*  from emp
SQL> /

   DEPTNO S                                               RN
--------- ---------------------------------------- ---------
       10 CLARK                                            1
       10 CLARK,KING                                       2
       10 CLARK,KING,MILLER                                3
       20 ADAMS                                            4
       20 ADAMS,FORD                                       5
       20 ADAMS,FORD,JONES                                 6
       20 ADAMS,FORD,JONES,SCOTT                           7
       20 ADAMS,FORD,JONES,SCOTT,SMITH                     8
       30 ALLEN                                            9
       30 ALLEN,BLAKE                                     10
       30 ALLEN,BLAKE,JAMES                               11
       30 ALLEN,BLAKE,JAMES,MARTIN                        12
       30 ALLEN,BLAKE,JAMES,MARTIN,TURNER                 13
       30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD            14

14 rows selected.

Is there any probability that oracle return
       30 MARTIN,TURNER,WARD,ALLEN,BLAKE,JAMES            14
instead of
       30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD            14
because
"STRAGG can be done in parallel. So even though each thread will get its portion of rows sorted 
(which again can be debated), we have no clue in what order threads will finish. Therefore there is 
no guarantee ODCIAggregateMerge will merge thread results in correct (sort-wise) order."

I know, that "the only way to get sorted data....   use order by", but question about order in each 
row.


Followup   April 17, 2009 - 3pm Central time zone:

the data would be fed to the aggregate in order, the rows returned by the query would not be in any order.

So, stragg would see the data "in order of ename" for each piece - and aggregate merge would get two sorted pieces - if it just put them together - the resulting string might not be sorted anymore

ctl-f for

Ok, here is one that does "sorted" aggregates


that one would work, since it SORTS the data itself.
5 stars How to Eliminate in active & Duplicate Names   June 9, 2009 - 2pm Central time zone
Reviewer: Sambi from NJ
Hi tom,

CREATE TABLE TNX_FRM
(
  LOC_ID     INTEGER,
  FORM_CD    VARCHAR2(50 BYTE),
  CHANGE_NO  INTEGER,
  IS_ACTIVE  VARCHAR2(1 BYTE)
);



Insert into TNX_FRM 
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'HOME', 1, 'Y');
Insert into TNX_FRM
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'ALARM', 1, 'Y');
Insert into TNX_FRM
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'ALARM', 1, 'Y');
Insert into TNX_FRM
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'ADJUSTMENT', 1, 'Y');
Insert into TNX_FRM
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'CONTRACT', 1, 'Y');
Insert into TNX_FRM
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'DEDUCTIBLE', 1, 'Y');
Insert into TNX_FRM
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'SAFEGUARD', 1, 'Y');
Insert into TNX_FRM
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'SAFEGUARD', 2, 'N');
Insert into TNX_FRM
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'ALARM', 2, 'N');
Insert into TNX_FRM
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'CONTRACT', 2, 'N');
Insert into TNX_FRM
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'CONTRACT', 3, 'Y');
Insert into TNX_FRM
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'DEDUCTIBLE', 2, 'N');
Insert into TNX_FRM
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'ADJUSTMENT', 2, 'N');
Insert into TNX_FRM
   (LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
 Values
   (55, 'DEDUCTIBLE', 3, 'Y');

COMMIT;



select loc_id,form_cd,change_no,is_active 
from tnx_frm
order by loc_id,change_no,form_cd


LOC_ID    FORM_CD    CHANGE_NO    IS_ACTIVE

55    ADJUSTMENT    1    Y
55    ALARM        1    Y
55    ALARM        1    Y
55    CONTRACT    1    Y
55    DEDUCTIBLE    1    Y
55    HOME        1    Y
55    SAFEGUARD    1    Y
55    ADJUSTMENT    2    N
55    ALARM        2    N
55    CONTRACT    2    N
55    DEDUCTIBLE    2    N
55    SAFEGUARD    2    N
55    CONTRACT    3    Y
55    DEDUCTIBLE    3    Y

14 rows selected.

I want the below output.


LOC_ID    CHANGE_NO    NAMES

55    1    ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
55    2    HOME
55    3    HOME,CONTRACT,DEDUCTIBLE

is_active indicates whether the name(form_cd) is active or has been dropped.The names get carried 
over 
to the next change nos unless they are explicitly set in that change_no using is_active='N'

for eg: at change_no 2 all the other names(form_cd) except HOME are no more valid because they are 
explicitly set as 'N'

i can use stragg or or sys_connect_by_path trick for this kind of queries but 

i)  i have to get the distinct codes on loc_id level otherwise i get duplicates in the list 
(eg : for change_no 3 i get HOME,CONTRACT,DEDUCTIBLE,CONTRACT,DEDUCTIBLE.)

ii) when i build the path a particular name may become invalid in the current change but the same 
name may be inherited from 

a previous change.that case i may need to do a INSTR to see if the name is already coming from a 
previous change ..but that 

is not going to work as i have multiple names de-activated in the same change ( eg: change_no 2)


i am using the following to insert into a table 

select loc_id,change_no,(select wm_concat(distinct case when is_active='N' then NULL 
else form_cd end )  
from  misp_temp.tnx_frm i
where i.change_no <= t.change_no
) names
from  misp_temp.tnx_frm t
group by loc_id,change_no
order by change_no

and then loop thru each change to see if there are any actives and remove them..

I don't want to use wm_concat as this is not documented.

I prefer to get this in a query.

Thanks in Advance.


Followup   June 10, 2009 - 8am Central time zone:

... for eg: at change_no 2 all the other names(form_cd) except HOME are no more
valid because they are explicitly set as 'N'
...

does not make sense. what is special about "HOME", it comes before SAFEGUARD in the output above - what made it so darn special

you need to explain this much much better - and it would see to me that people didn't necessarily think about what they were doing with this data when they modelled the schema for it...

pretend you were giving the specifications for this to your mom, be that specific, that precise, using terminology everyone can understand (eg: why would you write: ... at change_no 2 all the other names(form_cd) ... - just to be purposely confusing? "Well, I think of the attribute as a name - but we really call it form_cd, so when I say 'name' you should think 'form_cd')
5 stars   June 10, 2009 - 12pm Central time zone
Reviewer: Sambi from NJ,USA
Tom,

Sorry for the confusion..

I wanted the "names" column in the output sorted in alphabetical order(ascending).

LOC CHANGE  NAMES
_ID _NO
----------------------------------
55 1  ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
55 2  HOME
55 3  CONTRACT,DEDUCTIBLE,HOME

This is owned by the ETL team and I Agree with you..it is not a good design.

The "names" column for change_no 2 should contain the comma separated list of distinct active form_cd values from all prior change_nos and the current change_no i.e <=current change_no

A form_cd is active if the most recent value of is_active flag for that form_cd is 'Y' in the change_nos <=current change_no
and it appears in the list.

Case i) For the input like this..

loc_id form_cd  change_no is_active
-------------------------------------
55  ADJUSTMENT  1    Y
55  ALARM      1    Y
55  ALARM      1    Y
55  CONTRACT    1    Y
55  DEDUCTIBLE  1    Y
55  HOME      1    Y
55  SAFEGUARD  1    Y
55  ADJUSTMENT  2    Y
55  ALARM      2    Y
55  CONTRACT    2    Y
55  DEDUCTIBLE  2    Y
55  SAFEGUARD  2    Y
55  CONTRACT    3    Y
55  DEDUCTIBLE  3    Y


The output should be

LOC CHANGE NAMES
_ID _NO
----------------------------------------
55  1 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
55  2 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD


Case ii) input

loc_id form_cd  change_no is_active
-------------------------------------
55  ADJUSTMENT  1    Y
55  ALARM      1    Y
55  ALARM      1    Y
55  CONTRACT    1    Y
55  DEDUCTIBLE  1    Y
55  HOME      1    Y
55  SAFEGUARD  1    Y
55  DEDUCTIBLE  2    Y
55  SAFEGUARD  2    Y
55  CONTRACT    3    Y
55  DEDUCTIBLE  3    Y


output:

LOC CHANGE NAMES
_ID _NO
----------------------------------------
55  1 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
55  2 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD


Case iii)

For the input like this..

loc_id form_cd  change_no is_active
-------------------------------------
55  ADJUSTMENT  1    Y
55  ALARM      1    Y
55  ALARM      1    Y
55  CONTRACT    1    Y
55  DEDUCTIBLE  1    Y
55  HOME      1    Y
55  SAFEGUARD  1    Y
55  ADJUSTMENT  2    N
55  ALARM      2    N
55  CONTRACT    2    N
55  DEDUCTIBLE  2    N
55  SAFEGUARD  2    N
55  CONTRACT    3    Y
55  DEDUCTIBLE  3    Y


The output should be (for change_no 1 & 2)

LOC CHANGE  NAMES
_ID  _NO
----------------------------------------
55 1  ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
55 2  HOME

because in change_no 2 all the other "form_cd"s of pior change_nos except "HOME" are set as inactive using is_active='N'

and change_no 3 the form_cds "CONTRACT" and "DEDUCTIBLE" are made "active" by setting is_active='Y' and so
should contain all the active form_cds of change_nos <=3.

Output (for chagne_no:3 )

LOC_ID  CHANGE_NO  NAMES
----------------------------------------
55      3    CONTRACT,DEDUCTIBLE,HOME

so the complete output will be

LOC CHANGE NAMES
_ID _NO
----------------------------------------
55 1  ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
55 2  HOME
55 3  CONTRACT,DEDUCTIBLE,HOME

Hope this is clear. Thanks in advance.


Followup   June 10, 2009 - 4pm Central time zone:

ops$tkyte%ORA10GR2> with
  2  -- we need all of the distinct form_cd values, we'll outer join to this
  3  -- set to "fill in the missing values".  Basically, we need to de-sparsify
  4  -- the data, make sure every loc_id/change_no has all of the form_cd values
  5  distinct_form_cd
  6  as
  7  (
  8  select distinct form_cd
  9    from tnx_frm
 10  ),
 11  -- we also need to get rid of duplicates.  I am making the (reasonable)
 12  -- assumption that if in a given loc_id, change_no, form_cd group - there
 13  -- is any row with a value of "Y", it is the "relevant" one - eg: if in a
 14  -- group there is a "Y" and and "N" row, we'll keep the "Y" row (since Y>N)
 15  distinct_tnx_frm
 16  as
 17  (
 18  select loc_id, change_no, form_cd, max(is_active) is_active
 19    from tnx_frm
 20   group by loc_id, change_no, form_cd
 21  ),
 22  -- now we need to fill in the gaps, the partitioned outer join (10g)
 23  -- does that for us.  We break the tnx_frm data up by loc_id,change_no
 24  -- and for every group we get - we outer join to the distinct form_cds
 25  -- now, every loc_id, change_no has EVERY form_cd.  We use LAST_VALUE
 26  -- then to carry down the last observed value of IS_ACTIVE across all
 27  -- of the groups - so if the first change_no in a loc_id has "Y" and the
 28  -- third change_no in a loc_id has "N" and the fifth has "Y" again - then
 29  -- groups 1, 2 = "Y", 3, 4 = "N", and 5 on up will have "Y"
 30  carry_down_is_active
 31  as
 32  (
 33  select distinct_form_cd.form_cd, distinct_tnx_frm.loc_id, distinct_tnx_frm.change_no,
 34         last_value( distinct_tnx_frm.is_active ignore nulls )
 35              over (partition by loc_id,distinct_form_cd.form_cd order by change_no) isact
 36    from distinct_tnx_frm partition by (loc_id,change_no)
 37            right outer join distinct_form_cd
 38              on (distinct_tnx_frm.form_cd = distinct_form_cd.form_cd)
 39  ),
 40  -- Now we just pull off the active records in each loc_id/change_no pair
 41  -- and then assign a row_number after sorting by form_cd
 42  data
 43  as
 44  (
 45  select loc_id, form_cd, change_no,
 46         row_number() over (partition by loc_id, change_no order by form_cd) rn
 47    from carry_down_is_active
 48         where isact = 'Y'
 49  )
 50  -- so we can string aggregate using a connect by trick...
 51  -- ta-dah, we are done
 52  select loc_id, change_no,
 53         substr( max(sys_connect_by_path( form_cd, ',' )), 2 ) form_cds
 54    from data
 55   start with rn = 1
 56  connect by prior rn+1 = rn and prior loc_id = loc_id and prior change_no = change_no
 57   group by loc_id, change_no
 58   order by 1, 2
 59  /

    LOC_ID  CHANGE_NO FORM_CDS
---------- ---------- ----------------------------------------------------------------------
        55          1 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
        55          2 HOME
        55          3 CONTRACT,DEDUCTIBLE,HOME



5 stars   June 11, 2009 - 10am Central time zone
Reviewer: Sambi from NJ
Thanks tom.That is really great..
This is exactly what i wanted..
Thank you once again..


5 stars row concatenation into a clob   June 30, 2009 - 10am Central time zone
Reviewer: dxl from uk
This thread is very useful for concatenating smallish amounts of column data into a single entry in 
a row... however what technique would you employ if you had to concatenate many thousands of column 
data??

I am trying to come up with a way of taking a column in a table with many thousands of rows and 
concatenating the rows together, maybe using an end of line delimiter between rows, and then 
passing this data into a clob/blob so that a procedure can stream this data back to a jdbc client.

So the basic principle is how to stream the contents of a column in a table back to a client via a 
packaged procedure (probably using clobs or blobs)??

The only way i can think of doing this is processing a batch of rows at a time and appending them 
to the clob out parameter of a procedure so that the jdbc client can stream this data.  Would a 
pipelined function help here to start populating the stream?

Any thoughts on the correct technique would be helpful.


Followup   July 6, 2009 - 6pm Central time zone:

... however what technique would you employ if you
had to concatenate many thousands of column data??
....

I would raise the question seriously "what the heck for - do you really mean to be using XML perhaps in this case, a big comma delimited string of thousands of rows is useless"


... and then passing this data into a clob/blob so
that a procedure can stream this data back to a jdbc client.
....

why??!?!?!? so the jdbc client can parse the string??? why would you not just PASS A RESULT SET BACK TO THEM???? 
5 stars Thanks! Tom - one review more pls   July 21, 2009 - 8am Central time zone
Reviewer: suresh from INDIA
Hi Tom,

I am writing a query to retrieve the shift dates from a table in the following manner.

Expected result :-
Month      Shift-1     Shift-2
Jan2009    01,02,03    7,9,11,23,34
Feb2009    01,02,03    7,9,11,23,34

The query i am using for same
==================================
SELECT to_char(shift_date,'MONYYYY') ,
       LTRIM(MAX( SYS_CONNECT_BY_PATH (to_char(shift_date,'DD'), ',')),',') shift-1
   FROM
   (SELECT  peoplesoft_id,shift_date,shift_codes,
            row_number() OVER ( PARTITION BY to_char(shift_date,'MONYYYY')  ORDER BY shift_date) rn
   FROM  shift_plan where  peoplesoft_id=1234567 )
   CONNECT  BY  to_char(shift_date,'MONYYYY') = PRIOR to_char(shift_date,'MONYYYY')
       AND  rn   = PRIOR rn+1
   START WITH rn =1
   GROUP BY to_char(shift_date,'MONYYYY');

**** But the problem here is where to put the condition for each shift type ****
1) If i put the condition in row over partition statement then i get the result for only for that 
shift i.e shift-1 or something.
2) I have tried to use case statement to filter the shift codes above in ltrim place but the 
condition itself not working
ex:- ltrim(max(case when shift_codes in ('W1','S1','O1') then LTRIM(MAX( SYS_CONNECT_BY_PATH 
(to_char(shift_date,'DD'), ',')),',') end

Please help.

-Thanks
Suresh


Followup   July 24, 2009 - 12pm Central time zone:

no create
no inserts
no example

I don't even bother to read.... no idea if we can answer your question, but unless you make it easy for us to answer it - no one will look at it.
5 stars Thanks! one more review pls   July 25, 2009 - 7am Central time zone
Reviewer: Suresh from INDIA
Hi Tom,

Apologies for not clear enough.

Here is the table.
Table Shift_Codes
=================
Shift_Code  Shift_Descriptio
W        GeneralShift
W1        Shift1
W2        Shift2
W3        Shift3
...
and so on upto 30 shiftcodes.


Shift_Plan
============
Psoftid  Date    ShiftCode
1234567  01-Jan-09  W1
1234567  02-Jan-09  W1
1234567  03-Jan-09  W2
...
1234567  26-Jan-09  H
1234567  01-Feb-09  W1
1234567  02-Feb-09  W2
1234567  06-Feb-09  W3
..

I am generating a report of shift wise dates like below by selecting the psoftid Employee can view his shifts

Month Shift-1 Shift-2 Shift-3 Holidays Weekends
Jan-09 01,02  03    -    26    14,15,21,22
Feb-09 01,    02    06    -    14,15,21,22


For this i have written the below query and this works well upto shift-1
=======================================
SELECT to_char(shift_date,'MONYYYY') ,
LTRIM(MAX( SYS_CONNECT_BY_PATH (to_char(shift_date,'DD'), ',')),',') shift-1
  FROM
  (SELECT peoplesoft_id,shift_date,shift_codes,
        row_number() OVER ( PARTITION BY to_char(shift_date,'MONYYYY') ORDER BY shift_date) rn
  FROM shift_plan where peoplesoft_id=1234567 and shift_codes in ('W1')
  CONNECT BY to_char(shift_date,'MONYYYY') = PRIOR to_char(shift_date,'MONYYYY')
    AND rn  = PRIOR rn+1
  START WITH rn =1
  GROUP BY to_char(shift_date,'MONYYYY');
Output
======
Month Shift-1
Jan-09 01,02 
Feb-09 01,   

But how to put condition for others columns shift-2,shift-3,Holidays,Weekends in my above query.


I tried to keep the condition like below.

SELECT to_char(shift_date,'MONYYYY') ,
LTRIM(MAX( case when shift_codes in ('W1') then SYS_CONNECT_BY_PATH (to_char(shift_date,'DD'), ','))end as "Shift-1",',') FROM
  (SELECT peoplesoft_id,shift_date,shift_codes,
        row_number() OVER ( PARTITION BY to_char(shift_date,'MONYYYY') ORDER BY shift_date) rn
  FROM shift_plan where peoplesoft_id=1234567 )
  CONNECT BY to_char(shift_date,'MONYYYY') = PRIOR to_char(shift_date,'MONYYYY')
    AND rn  = PRIOR rn+1
  START WITH rn =1
  GROUP BY to_char(shift_date,'MONYYYY');

But the i got similar output, means case not working
Month Shift-1
Jan-09 01,02,03,...26,14,15,21,22..31
Feb-09 01,02,..06...14,15,21,22...28
....


Hope this time I am clear enough. Please advise.

-Thanks
Suresh




Followup   July 26, 2009 - 7am Central time zone:

no CREATE
no INSERTS
no look
never
not then
not now
not tomorrow


I honestly have no idea if your question can be answered - but only because I did not read it, and will not read it unless and until it includes a CREATE TABLE statement I CAN RUN, and insert into statements that I CAN RUN.

You know, sort of like I always give you guys...
4 stars STRAGG with "distinct"   September 24, 2009 - 9am Central time zone
Reviewer: Joe from NY
Tom,

I am using the STRAGG function to concatenate a column from multiple rows but I only want the distinct values. So I did a "STRAGG(distinct col1)" and this works great. However, when I put this SQL in my procedure it fails to compile with "PL/SQL: ORA-30482: DISTINCT option not allowed for this function".

Can you tell me why this works when I run the sql by itself and fails whenever I put it in a procedure or BEGIN/END block? Is there a way to fix it? I guess I could first select the distinct values in a sub-query and then use stragg but was wondering if there is another way.

Below is a sample you can run in a command window that show that it works by itself and fails inside a begin/end block.


create table t (col1 varchar2(20));

-- This inserts works here
insert into t
  select stragg(distinct dummy)
    from dual;


-- When inside a procedure or begin/end block, the insert fails
begin
  insert into t
    select stragg(distinct dummy)
      from dual;
end;
/


Thanks,
Joe (Oracle 10g)

Followup   September 29, 2009 - 8am Central time zone:

I filed a bug on this as it reproduces in 9i-11gr2.  It should work

You can either

a) use distinct AND THEN stragg as you said...
b) use dynamic sql, hide the construct from plsql
c) use a view, hide the construct from plsql

Write a Review
 


All information and materials provided here are provided "as-is"; Oracle disclaims all express and implied warranties, including, the implied warranties of merchantability or fitness for a particular use. Oracle shall not be liable for any damages, including, direct, indirect, incidental, special or consequential damages for loss of profits, revenue, data or data use, incurred by you or any third party in connection with the use of this information or these materials.

About Oracle | Legal Notices and Terms of Use | Privacy Statement