Skip to Main Content

Breadcrumb

May 4th

Question and Answer

Tom Kyte

Thanks for the question, Philip.

Asked: February 14, 2004 - 9:08 pm UTC

Last updated: November 16, 2011 - 9:16 am UTC

Version: 9.0.2

Viewed 50K+ times! This question is

You Asked

Hi Tom,

Need your help here.

Stage table:

Customer Product Order Date Operator
-------- ------------------ ----------
C1 P1 D1 OP1
C1 P2 D2 OP2
C1 P3 D3 OP3
C1 P1 D4 OP3

C2 P2 D1 OP2
C2 P3 D2 OP4

C3 P1 D2 OP1
C3 P1 D3 OP1

I would like the results to look like the following (ordered by product name):

Output table:

Customer Product_Str Order_Date_Str Operator_Str
-------- ----------- -------------- ------------
C1 P1,P1,P2,P3 D1,D4,D2,D3 OP1,OP3,OP2,OP4
C2 P2,P3 D1,D2 OP2,OP4
C3 P1,P1 D2,D3 OP1,OP1

There are about 14 millions customers and the number of orders per customer are infinite.

I was able to use STRAGG function to build the string fields fast (less than an hour). But the product name is not in order and these three fileds are out of sync. I know “Set” has no order concept.

Is there any other way to construct this? I have tried to use a function but it’s way too slow. I have also tried to use analytic function row_number() over… but since the order number is not known…


Create table stage
(customer varchar(2),
product varchar(2),
order_date varhcar2(2),
operator varchar2(9));

insert into stage values (‘C1’,’P1’,’D1’,’OP1’);
insert into stage values (‘C1’,’P2’,’D2’,’OP2’);
insert into stage values (‘C1’,’P3’,’D3’,’OP3’);
insert into stage values (‘C1’,’P1’,’D4’,’OP3’);
insert into stage values (‘C2’,’P2’,’D1’,’OP2’);
insert into stage values (‘C2’,’P3’,’D2’,’OP4’);
insert into stage values (‘C3’,’P1’,’D2’,’OP1’);
insert into stage values (‘C3’,’P1’,’D3’,’OP1’);

commit;

create table output
(customer varchar2(2),
product_str varchar2(100),
order_date_str varchar2(100),
operator_str varchar2(100));

SQL>Insert into output
Select customer, stragg(product) ,stragg(order_date), stragg(operator)
From stage
Group by customer;

Thank you!

=================

Tom,

On average the products ordered by a customer is 18. But it could be as high as 40 and as low as 1.

a) the three fields need to be correlated -- eg: sort order_date and operator
by product and string them in that order.

==> The three fields need to be correlated


b) are the three fields "unique" in the customer. eg: will the same product appear over and over

==> The product can be repeated over and over again. So can order date and operator ID.

Thanks!








and Tom said...

I would use row_number() and pivot then. It will be faster then modifying stragg to procedurally "sort" the correlated data with duplicates.


ops$tkyte@ORA920PC> declare
2 l_stmt long;
3 begin
4 l_stmt := 'insert into output select customer, rtrim( max(decode(rn,1,product)) ';
5 for i in 2 .. 40
6 loop
7 l_stmt := l_stmt || ' || '','' || max(decode( rn, ' || i || ', product ))';
8 end loop;
9 l_stmt := l_stmt || ', '','' ), rtrim( max(decode(rn,1,order_date)) ';
10 for i in 2 .. 40
11 loop
12 l_stmt := l_stmt || ' || '','' || max( decode( rn, ' || i || ', order_date ))';
13 end loop;
14 l_stmt := l_stmt || ', '','' ), rtrim( max(decode(rn,1,operator)) ';
15 for i in 2 .. 40
16 loop
17 l_stmt := l_stmt || ' || '','' || max( decode( rn, ' || i || ', operator ))';
18 end loop;
19 l_stmt := l_stmt || ', '','' ) from (select stage.*, row_number() over (partition by customer order by rowid) rn from stage)';
20 l_stmt := l_stmt || ' group by customer';
21
22 execute immediate l_stmt;
23 end;
24 /

PL/SQL procedure successfully completed.

ops$tkyte@ORA920PC> select * from output;

CU PRODUCT_STR
-- ----------------------------------------------------------------------------------------------------
ORDER_DATE_STR
----------------------------------------------------------------------------------------------------
OPERATOR_STR
----------------------------------------------------------------------------------------------------
C1 P1,P2,P3,P1
D1,D2,D3,D4
OP1,OP2,OP3,OP3

C2 P2,P3
D1,D2
OP2,OP4

C3 P1,P1
D2,D3
OP1,OP1


If you are worried about "having 41" someday, just add:


l_stmt := l_stmt || ' where decode( rn, 41, 1/0 ) is null group by customer';


that way, the statement will FAIL if there are 41 -- letting you know -- so you can increase the size of the "loop"



Rating

  (90 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Product String field is not sorted

Philip Kuan, February 16, 2004 - 9:46 am UTC

Tom,

Sorry if I did not make myself clear. The product_str field needs to be sorted by product name.

So the output for C1 should be P1,P1,P2,P3. And sort order_date and operator by product and string them in that order.

Thank you.


Tom Kyte
February 16, 2004 - 10:04 am UTC

sorry -- order by product_name, rowid


(partition by customer order by PRODUCT_NAME, rowid)




Excellent

Philip Kuan, February 16, 2004 - 8:52 pm UTC

Tom,

The solution works really well. Thank you so much!

Good

James, March 14, 2004 - 10:57 pm UTC

Dear Sir,
How to create our own operator using the command
sql> create operator op1 ..
Do you have a demo for this?
Please do reply.


Tom Kyte
March 15, 2004 - 7:18 am UTC

No, i do not -- but the documentation does:

</code> http://docs.oracle.com/docs/cd/B10501_01/appdev.920/a96595/toc.htm <code>



user defined analytic function

Nopparat V., March 18, 2004 - 7:04 am UTC

It will be a good idea if we can create a user defined analytic function in the same way we did with a user defined aggregate function. (Oracle should consider its possibility)

I have modified your StrAgg function to use in case sorting is required.

create or replace type stringOrder as object (
TheString varchar2(10) ,
TheOrder number(2) ) ;
/
create or replace type StringAggType2 as object
(
theString varchar2(4000),
MaxOrder number(2),
static function
ODCIAggregateInitialize(sctx IN OUT StringAggType2 )
return number,
member function
ODCIAggregateIterate(self IN OUT StringAggType2 ,
value IN StringOrder )
return number,
member function
ODCIAggregateTerminate(self IN StringAggType2,
returnValue OUT varchar2,
flags IN number)
return number,
member function
ODCIAggregateMerge(self IN OUT StringAggType2,
ctx2 IN StringAggType2)
return number
);
/
create or replace type body StringAggType2
is
static function ODCIAggregateInitialize(
sctx IN OUT StringAggType2)
return number
is
begin
sctx := StringAggType2( null,0 );
return ODCIConst.Success;
end;
member function ODCIAggregateIterate(
self IN OUT StringAggType2 ,
value IN StringOrder )
return number
is
begin
if value.TheOrder > self.MaxOrder then
self.theString := self.theString ||
lpad(',',value.TheOrder-self.MaxOrder,',') ||
value.TheString ;
self.MaxOrder := value.TheOrder ;
elsif value.TheOrder < self.MaxOrder then
self.theString := substr(self.theString,1,instr(self.theString,',',1,value.TheOrder))||
value.TheString ||
substr(self.theString,instr(self.theString,',',1,value.TheOrder)+1) ;
end if ;
return ODCIConst.Success;
end;
member function ODCIAggregateTerminate(
self IN StringAggType2,
returnValue OUT varchar2,
flags IN number)
return number
is
begin
returnValue := rtrim( ltrim( self.theString, ',' ), ',' );
return ODCIConst.Success;
end;
member function ODCIAggregateMerge(
self IN OUT StringAggType2,
ctx2 IN StringAggType2)
return number
is
begin
self.theString := self.theString || ',' || ctx2.theString;
self.MaxOrder := self.MaxOrder + ctx2.MaxOrder ;
return ODCIConst.Success;
end;
end ;
/
CREATE OR REPLACE FUNCTION StrAggOrd(input StringOrder)
RETURN varchar2
PARALLEL_ENABLE AGGREGATE USING StringAggType2;
/

Then we can use it by

select deptno, StrAggOrd(StringOrder(ename,rn)) member
from
(select deptno, ename,
row_number () over (partition by deptno order by ename) rn
from emp)
group by deptno ;

How about this solution ?

Tom Kyte
March 18, 2004 - 8:01 am UTC

what do you mean?  user defined aggregates are documented to be "analytic friendly"

Just take my original stragg unchanged:

ops$tkyte@ORA9IR2> l
  1  select deptno, stragg(ename) over (partition by deptno order by ename) ename
  2* from emp
ops$tkyte@ORA9IR2> /
 
    DEPTNO ENAME
---------- ----------------------------------------
        10 CLARK
        10 CLARK,KING
        10 CLARK,KING,MILLER
        20 ADAMS
        20 ADAMS,FORD
        20 ADAMS,FORD,JONES
        20 ADAMS,FORD,JONES,SCOTT
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN
        30 ALLEN,BLAKE
        30 ALLEN,BLAKE,JAMES
        30 ALLEN,BLAKE,JAMES,MARTIN
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
 
14 rows selected.
 
 

exactly what I was looking for yesterday

bob, March 18, 2004 - 10:05 am UTC

Your holding out on me Tom. :)

Yesterday when I asked:

"Using StringAgg, how can I order the concatenation so that it occurs alphabetically? "

You gave me:

"You'd have to "save up" the inputs and then sort them and then concatenate them.
they come in random order to stragg... "

Today, you give Philip the key with the use of analytics.
I guess I just didn't know how to ask my question correctly, but thankfully philp was trying to do the same exact thing I was.

This site is great!


Tom Kyte
March 18, 2004 - 10:21 am UTC

but analytics don't really do that -- for if you look at the results -- you didn't get just an alphabetic listing of enames by deptno, you got a record per employee!

So, in response to:

<quote>
It will be a good idea if we can create a user defined analytic function in the 
same way we did with a user defined aggregate function. (Oracle should consider 
its possibility)
</quote>

the answer is "well, we do that".  But if you want stragg to work as an AGGREGATE with sorted data -- the answer is "aggregates don't do that"


Ok, here is one that does "sorted" aggregates:

<b>
ops$tkyte@ORA9IR2> create or replace type vcArray as table of varchar2(4000)
  2  /
Type created.
</b>

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace type string_agg_type as object
  2  (<b>
  3     data  vcArray,</b>
  4
  5     static function
  6          ODCIAggregateInitialize(sctx IN OUT string_agg_type )
  7          return number,
  8
  9     member function
 10          ODCIAggregateIterate(self IN OUT string_agg_type ,
 11                               value IN varchar2 )
 12          return number,
 13
 14     member function
 15          ODCIAggregateTerminate(self IN string_agg_type,
 16                                 returnValue OUT  varchar2,
 17                                 flags IN number)
 18          return number,
 19
 20     member function
 21          ODCIAggregateMerge(self IN OUT string_agg_type,
 22                             ctx2 IN string_agg_type)
 23          return number
 24  );
 25  /
 
Type created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace type body string_agg_type
  2  is
  3
  4  static function ODCIAggregateInitialize(sctx IN OUT string_agg_type)
  5  return number
  6  is
  7  begin<b>
  8      sctx := string_agg_type( vcArray() );</b>
  9      return ODCIConst.Success;
 10  end;
 11
 12  member function ODCIAggregateIterate(self IN OUT string_agg_type,
 13                                       value IN varchar2 )
 14  return number
 15  is
 16  begin<b>
 17      data.extend;
 18      data(data.count) := value;</b>
 19      return ODCIConst.Success;
 20  end;
 21
 22  member function ODCIAggregateTerminate(self IN string_agg_type,
 23                                         returnValue OUT varchar2,
 24                                         flags IN number)
 25  return number
 26  is<b>
 27      l_data varchar2(4000);
 28  begin
 29      for x in ( select column_value from TABLE(data) order by 1 )
 30      loop
 31              l_data := l_data || ',' || x.column_value;
 32      end loop;
 33      returnValue := ltrim(l_data,',');</b>
 34      return ODCIConst.Success;
 35  end;
 36
 37  member function ODCIAggregateMerge(self IN OUT string_agg_type,
 38                                     ctx2 IN string_agg_type)
 39  return number
 40  is
 41  begin<b> -- not really tested ;)
 42      for i in 1 .. ctx2.data.count
 43      loop
 44              data.extend;
 45              data(data.count) := ctx2.data(i);
 46      end loop;</b>
 47      return ODCIConst.Success;
 48  end;
 49
 50
 51  end;
 52  /
 
Type body created.
 
ops$tkyte@ORA9IR2> show err
No errors.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> CREATE or replace
  2  FUNCTION stragg(input varchar2 )
  3  RETURN varchar2
  4  PARALLEL_ENABLE AGGREGATE USING string_agg_type;
  5  /
 
Function created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> column ename format a40
ops$tkyte@ORA9IR2> select deptno, stragg(ename) ename
  2    from emp
  3   group by deptno
  4  /
 
    DEPTNO ENAME
---------- ----------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
 


You could take the above other example and apply the same technique (the row_number() concept is intriguing as it allows you to specify some other sorted order) 

same result

bob, March 18, 2004 - 12:56 pm UTC

Hi Tom,

I had just wrapped that earlier result (one row per employee) in another analytic to grab the result from the last employee that contained the full list for the dept, so ultimately, the initial use of analytics which returned a result for each employee led me to use another one, to get what I was looking for without rewriting the user defined aggregate like you did.

I am sure this is far from ideal, but it worked initially.

Thank you

Nopparat V., March 18, 2004 - 11:16 pm UTC

I'm sorry that I didn't finish the whole chapter of User-Defined Aggregate Functions really. You're right. It all was said in the book.

Also I agree with Bob. If we can use original StringAgg with some analytic options in order to complete our requirement, it should be the better choice. I think the original one can be used in more general purpose than the other modified ones.

Tom Kyte
March 19, 2004 - 8:15 am UTC

but analytics are fundementally *not* aggregates.

Think about how many times the analytic function is invoked vs how many times the aggregate is.....

More Questions

Nopparat V., March 19, 2004 - 12:12 am UTC

I have found from the book that there is another interface ODCIAggregateDelete to use in case of analytic function. I, however, can't understand it well. Could you explain me more how it work? or Just show me some example to implement.

Also please advise whether I should add this interface into the original StrAggType. How it affect the performance when we use it in the simple aggregate action?

Tom Kyte
March 19, 2004 - 8:43 am UTC

it is an optimization (and a good question -- at first I was going to leave this as an exercise for the reader, but I got curious myself and proved that "every day you can learn something new about Oracle", this was my new thing today :)

http://docs.oracle.com/docs/cd/B10501_01/appdev.920/a96595/dci11agg.htm#1004794 http://docs.oracle.com/docs/cd/B10501_01/appdev.920/a96595/dci19agg.htm#76648

It would be most useful with sliding windows, or non "top anchored" windows.

In the above example -- the stragg result was always "growing" or just "reset".  it would never delete.  If I had a SLIDING window however -- it would be different.  Consider a sliding window of the "preceding, current and next" row.  stragg would be gluing three rows together and slide through the result set.  Well, there are two ways to do this -- with and without the delete.

In order to see the difference, we'll create a small package that is useful for collecting your own custom statistics in your session:

ops$tkyte@ORA9IR2> create or replace context stats_ctx using stats_pkg
  2  /
 
Context created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace package stats_pkg
  2  as
  3      procedure reset;
  4      procedure inc( p_name in varchar2, p_amt in number default 1 );
  5      procedure show( p_reset in boolean default false );
  6  end;
  7  /
 
Package created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace package body stats_pkg
  2  as
  3  procedure reset
  4  is
  5  begin
  6      for x in ( select attribute
  7                   from session_context
  8                  where namespace = 'STATS_CTX' )
  9      loop
 10          dbms_session.set_context( 'stats_ctx', x.attribute, 0 );
 11      end loop;
 12  end;
 13
 14
 15  procedure inc( p_name in varchar2, p_amt in number default 1 )
 16  is
 17  begin
 18      dbms_session.set_context( 'stats_ctx', p_name, 
              nvl(sys_context('stats_ctx',p_name),0)+p_amt );
 19  end;
 20
 21  procedure show( p_reset in boolean default false )
 22  is
 23  begin
 24      for x in ( select attribute, value
 25                   from session_context
 26                  where namespace = 'STATS_CTX'
 27                  order by attribute )
 28      loop
 29          dbms_output.put_line( rpad( x.attribute, 32 ) || x.value );
 30          if ( p_reset )
 31          then
 32              dbms_session.set_context( 'stats_ctx', x.attribute, 0 );
 33          end if;
 34      end loop;
 35  end;
 36
 37  end;
 38  /
 
Package body created.


Now, I instrumented the original stragg with:

...
  4  static function ODCIAggregateInitialize(sctx IN OUT string_agg_type)
  5  return number
  6  is
  7  begin
  8      sctx := string_agg_type( null );<b>
  9      stats_pkg.inc( 'ODCIAggregateInitialize' );</b>
 10      return ODCIConst.Success;
 11  end;
......

in each and every entry point.  Now, we run some queries:


ops$tkyte@ORA9IR2> column enames format a50
ops$tkyte@ORA9IR2> select deptno, stragg(ename) over (order by ename rows between 1 preceding and 1 following ) enames
  2    from emp
  3  /
 
    DEPTNO ENAMES
---------- --------------------------------------------------
        20 ADAMS,ALLEN
        30 ADAMS,ALLEN,BLAKE
        30 ALLEN,BLAKE,CLARK
        10 BLAKE,CLARK,FORD
...
        30 SMITH,TURNER,WARD
        30 TURNER,WARD
 
14 rows selected.
 
ops$tkyte@ORA9IR2> exec stats_pkg.show( true );
ODCIAGGREGATEINITIALIZE         13
ODCIAGGREGATEITERATE            38
ODCIAGGREGATETERMINATE          15
 
PL/SQL procedure successfully completed.

<b>as you can see -- that iterate routine was called LOTS as we slid through.  that is because each three row "window" was constructed one by one.  Now, look at a "growing only" example:</b>


ops$tkyte@ORA9IR2> select deptno, stragg(ename) over (partition by deptno order by ename) enames
  2    from emp
  3  /
 
    DEPTNO ENAMES
---------- --------------------------------------------------
        10 CLARK
        10 CLARK,KING
        10 CLARK,KING,MILLER
        20 ADAMS
        20 ADAMS,FORD
...
        30 ALLEN,BLAKE,JAMES,MARTIN
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
 
14 rows selected.
 
ops$tkyte@ORA9IR2> exec stats_pkg.show( true );
ODCIAGGREGATEINITIALIZE         3
ODCIAGGREGATEITERATE            14
ODCIAGGREGATETERMINATE          15
 
PL/SQL procedure successfully completed.

<b>iterate was called lots less (as was initialize -- once per partition)

Lets add this to the spec/body:</b>

ops$tkyte@ORA9IR2> create or replace type string_agg_type as object
  2  (
....
 20     member function
 21          ODCIAggregateDelete(self IN OUT string_agg_type,
 22                                 value IN varchar2 )
 23          return number,
...

ops$tkyte@ORA9IR2> create or replace type body string_agg_type
  2  is
  3
...
 23  member function ODCIAggregateDelete(self IN OUT string_agg_type,
 24                                      value IN varchar2 )
 25  return number
 26  is
 27  begin
 28      self.total := replace( self.total, ',' || value, '' );
 29      stats_pkg.inc( 'ODCIAggregateDelete' );
 30      return ODCIConst.Success;
 31  end;
......

<b>NOTE: warning Will Robinson -- danger danger....  I took a totally cheesy cheat here.  A simple replace.  This only works ONLY works because ename is UNIQUE (and contains no commas!).  Our code would have to obviously be more sophisticated for this to really work in general!!!!!

Now lets rerun that query:</b>


ops$tkyte@ORA9IR2> select deptno, stragg(ename) over (order by ename rows between 1 preceding and 1 following ) enames
  2    from emp
  3  /
 
    DEPTNO ENAMES
---------- --------------------------------------------------
        20 ADAMS,ALLEN
        30 ADAMS,ALLEN,BLAKE
        30 ALLEN,BLAKE,CLARK
        10 BLAKE,CLARK,FORD
...
        20 SCOTT,SMITH,TURNER
        30 SMITH,TURNER,WARD
        30 TURNER,WARD
 
14 rows selected.
 
ops$tkyte@ORA9IR2> exec stats_pkg.show( true );
ODCIAGGREGATEDELETE             12
ODCIAGGREGATEINITIALIZE         1
ODCIAGGREGATEITERATE            14
ODCIAGGREGATETERMINATE          15


<b>42 calls, vs 66 calls before -- and if the delete was "efficient", perhaps significantly reduced resource usage.

Note however that a growing example:</b>


ops$tkyte@ORA9IR2> select deptno, stragg(ename) over (partition by deptno order by ename) enames
  2    from emp
  3  /
 
    DEPTNO ENAMES
---------- --------------------------------------------------
....
        30 ALLEN,BLAKE,JAMES
        30 ALLEN,BLAKE,JAMES,MARTIN
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
 
14 rows selected.
 
ops$tkyte@ORA9IR2> exec stats_pkg.show( true );
ODCIAGGREGATEDELETE             0
ODCIAGGREGATEINITIALIZE         3
ODCIAGGREGATEITERATE            14
ODCIAGGREGATETERMINATE          15
 
PL/SQL procedure successfully completed.

<b>won't be affected...</b>
 

Thank you

Nopparat V., March 20, 2004 - 3:57 am UTC

Thank you very much. I'll try its effects on normal aggregrate by myself. But I think it probably result in the same manner as the last one. Then it mean we can add, and probably should add ,the delete function if we have any efficient one with no negative effects.

I strongly agree with you that "everyday you can learn something new about Oracle", especially from this website.

Another Curiousity

Nopparat V., March 20, 2004 - 4:39 am UTC

I expect the usage count of OCIAggregateTerminate of all sample to be 14 (equal to 14 rows of result). Why it become 15? When was it invoked for the additional one?

Tom Kyte
March 20, 2004 - 10:04 am UTC

you fetch 15 times for 14 rows, guess they are calling it regardless.

A reader, May 26, 2004 - 4:10 am UTC


stragg ... sometimes

New reader, July 02, 2004 - 1:59 pm UTC

STRAGG is great. I use it all the time. However I have to average a set of results only when all of the results in the set have a null operator. If any of the results have a value for the operator (<,>,etc) then all of the results should be STRAGG'd. As you can see below, because the set where assay_id = 2 has both null and non-null operators, I'm getting both the average and the STRAGG returned. Any suggestions to remove unwanted row?

CREATE TABLE t (
alt_id VARCHAR2(30),
lot_id VARCHAR2(30),
OPERATOR VARCHAR2(5),
result NUMBER,
assay_id NUMBER,
result_type NUMBER);

INSERT INTO t SELECT 'CorpID 1','1',NULL,10,1,10 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1',NULL,14,1,10 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1',NULL,15,2,4 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1',NULL,15,2,4 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1','>',50,2,4 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1','<',8,3,7 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1','<',10,3,7 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1','<',50,4,12 FROM dual;
INSERT INTO t SELECT 'CorpID 1','1','>',35,4,12 FROM dual;


CREATE OR REPLACE FUNCTION stragg (
pCol IN VARCHAR2,
pTbl IN VARCHAR2,
pWhere IN LONG,
pOrder IN VARCHAR2,
pSeparator IN VARCHAR2 DEFAULT ', ') RETURN VARCHAR2
AS
TYPE cur IS REF CURSOR;
myCur cur;
sAggValue VARCHAR2(4000);
sSep VARCHAR2(10);
sAggregate VARCHAR2(4000);
BEGIN
OPEN myCur FOR
'SELECT ' || pCol || ' ' || 'FROM ' || pTbl || ' ' || pWhere || ' ' || pOrder;

LOOP
FETCH myCur into sAggValue;
EXIT WHEN myCur%NOTFOUND;

sAggregate := sAggregate || sSep || sAggValue;
sSep := pSeparator;
END LOOP;
CLOSE myCur;

RETURN sAggregate;
END stragg;


SELECT DISTINCT alt_id,
lot_id,
CASE WHEN operator IS NULL THEN To_Char(Avg(result))
ELSE stragg('operator||result', 't', 'WHERE alt_id=''' || alt_id || ''' and lot_id=''' || lot_id || ''' and assay_id=' || assay_id || ' and result_type=' || result_type, 'ORDER BY result')
END AS result,
assay_id,
result_type
FROM t
GROUP BY alt_id, lot_id, operator, assay_id, result_type
ORDER BY alt_id, lot_id, assay_id, result_type;


alt_id lot_id result assay_id result_type
-------- ------ ------ -------- -----------
CorpID 1 1 12 1 10
CorpID 1 1 15 2 4 *unwanted
CorpID 1 1 15,15,>50 2 4
CorpID 1 1 <8,<10 3 7
CorpID 1 1 >35,<50 4 12

Tom Kyte
July 02, 2004 - 2:39 pm UTC

don't get it.

stragg ... sometimes

New Reader, July 02, 2004 - 2:54 pm UTC

The following statement is returning the average of the 2 results of 15 (assay_id=2) and the STRAGG'd list of results for all assay_id=2 (15,15,>50). For any group of results where one or more of the operators is not null (i.e. <,>,etc where assay_id=2 in this case), it should return just the STRAGG, not the average. So I should not get the second row because the two results of 15 are being reported in the third row.

SELECT DISTINCT alt_id,
lot_id,
CASE WHEN operator IS NULL THEN To_Char(Avg(result))
ELSE stragg('operator||result', 't', 'WHERE alt_id=''' || alt_id || ''' and lot_id=''' || lot_id || ''' and assay_id=' || assay_id || ' and result_type=' || result_type, 'ORDER BY result')
END AS result,
assay_id,
result_type
FROM t
GROUP BY alt_id, lot_id, operator, assay_id, result_type
ORDER BY alt_id, lot_id, assay_id, result_type;


alt_id lot_id result assay_id result_type
-------- ------ ------ -------- -----------
CorpID 1 1 12 1 10
CorpID 1 1 15 2 4 *unwanted
CorpID 1 1 15,15,>50 2 4
CorpID 1 1 <8,<10 3 7
CorpID 1 1 >35,<50 4 12

Tom Kyte
July 02, 2004 - 4:38 pm UTC

so, you want to process rows such that either

a) all of the operators are NULL in a given "group" - alt_id/lot_id being the definition of a group

b) only do the rows in a group where the operations are not null, if there is at least one not null operator

Use this in place of "T" in your query:


(
select *
from (
select t.*, max(operator) over (partition by alt_id, lot_id) max_oper
from t
)
where max_oper is null OR operator is not null
)


that'll find if there is a non-null operator in your group -- and keep only rows you want.

Packaged function

A reader, August 19, 2004 - 2:27 pm UTC

Can stragg be a packaged function instead of a standalone function?

Seems not, why is this?

Thanks

Tom Kyte
August 19, 2004 - 7:50 pm UTC

you need a standalone function, it is just what it is.

ops$tkyte@ORA9IR2> CREATE or replace
  2  FUNCTION stragg(input varchar2 )
  3  RETURN varchar2
  4  PARALLEL_ENABLE AGGREGATE USING string_agg_type;
  5  /

stragg is *just a name*, it is mapped to a type - the type is just like a package.

No additional functionality could be gained by a package here. 

User-defined aggregates vs. builtin aggregates

A reader, August 27, 2004 - 9:09 am UTC

Just curious...if I write my own version of some of the builtin aggregates like sum, min, max, etc, would that perform the same as builtins?

In other words, are user-defined aggregates second-class citizens or does Oracle internally apply the same optimizations it does to both?

Thanks

Tom Kyte
August 27, 2004 - 10:07 am UTC

builtin functions will almost always be more performant than user written code.

nothing about being a "second class citizen"

everything about "compiled C code generally runs faster than pcode from plsql"

User-defined aggregates vs. builtin aggregates

A reader, August 27, 2004 - 10:38 am UTC

"everything about compiled C code generally runs faster than pcode from plsql"

So would native PL/SQL compilation help here? What if I natively compile my user-defined aggragate function, that would create native compiled C code (dll or .so whatever) and then they should perform the same, right?

Tom Kyte
August 27, 2004 - 10:46 am UTC

native compilation can help in limited circumstances. you still go through the plsql runtime engine.

collect

Laurent Schneider, September 13, 2004 - 6:59 am UTC

pseudo string aggregate in 10g with COLLECT


SQL> create or replace type tov as table of varchar2(255);
/

Type created.

SQL> select deptno, cast(collect(ename) as tov) STRAGG
from emp group by deptno;

    DEPTNO STRAGG
---------- ---------------------------------------------------------------------------------------------------
        10 TOV('CLARK', 'KING', 'MILLER')
        20 TOV('SMITH', 'JONES', 'SCOTT', 'ADAMS', 'FORD')
        30 TOV('ALLEN', 'WARD', 'MARTIN', 'BLAKE', 'TURNER', 'JAMES')



Regards
Laurent
 

Locate STRAGG?

A reader, November 30, 2004 - 8:51 am UTC

Earlier, you mention
>> Just take my original stragg unchanged:

I'm trying to locate the code, but cannot find it.
Coul you add a link on this page? </code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:15637744429336, <code>


Tom Kyte
November 30, 2004 - 9:14 am UTC

when i plugged

stragg

into the search, the very first hit had it?


</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:2196162600402 <code>

Sorted aggregates

A reader, June 13, 2005 - 11:06 am UTC

This thread has aged a little...

In 9iR2, whats the latest and greatest way to get sorted aggregates using stragg? Tom presented a solution, some other reviewer had a suggestion, which one is the one to use?

Ideally, it should be as simple as adding another parameter to stragg i.e. stragg(<expression>,'asc|desc'). Why cant it that simple?

Thanks

Tom Kyte
June 13, 2005 - 12:02 pm UTC

because aggregates are sets, the concept of "sorting" does not apply to them at all.  


stragg with analytics works
or
stashing the data in your own state and stringing it up at the end works


ops$tkyte@ORA9IR2> select *
  2    from (
  3  select deptno,
  4         case when row_number() over (partition by deptno order by ename)=1
  5             then stragg(ename) over
  6                  (partition by deptno
  7                       order by ename
  8                         rows between unbounded preceding
  9                                  and unbounded following)
 10         end enames
 11    from emp
 12         )
 13   where enames is not null
 14  /
 
    DEPTNO ENAMES
---------- ------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
 

Re: Sorted aggregates

Laurent Schneider, June 14, 2005 - 8:14 am UTC

For Ascending order, you maybe can use "distinct" to simulate the sort.

select deptno, stragg(distinct ename) ename
from emp group by deptno;

    DEPTNO ENAME
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD




for desc order, you could use analytics

select distinct deptno, stragg(ename) over (partition by deptno order by ename desc rows between unbounded preceding and unbounded following) ename
from emp;

    DEPTNO ENAME
---------- --------------------------------------------------
        10 MILLER,KING,CLARK
        20 SMITH,SCOTT,JONES,FORD,ADAMS
        30 WARD,TURNER,MARTIN,JAMES,BLAKE,ALLEN

or a workaround like 

SQL> select deptno, translate(stragg(distinct translate(ename, 'ABCDEFGHIJKLMNOQRSTUVWXYZ', 'ZYXWVUTSRQONMLKJIHGFEDCBA')), 'ABCDEFGHIJKLMNOQRSTUVWXYZ', 'ZYXWVUTSRQONMLKJIHGFEDCBA') ename  from emp group by deptno;

    DEPTNO ENAME
---------- --------------------------------------------------
        10 MILLER,KING,CLARK
        20 SMITH,SCOTT,JONES,FORD,ADAMS
        30 WARD,TURNER,MARTIN,JAMES,BLAKE,ALLEN


I understand "a reader" confusion, because most aggregate functions use commutative properties like + and *.

Ex: SUM(1,2,3) = SUM(3,2,1)
AVG(1,10,100) = AVG(100, 10, 1)

The stragg UDAG is not deterministic. It is probably best not to use it in a MView or in a Function Based Index.

Kind regards
 

Tom Kyte
June 14, 2005 - 10:05 am UTC

don't -- that is like assuming GROUP BY sorts, it might, it might not. distinct might sort, it might not.

RE: that is like assuming GROUP BY sorts

Laurent Schneider, June 20, 2005 - 8:37 am UTC

Thanks for the info. It is fairly surprising! Nice to know !

Here is a proof that group by does not necessarly sort the way I expected :

SQL> create index desc_job_emp on emp(job desc); 

Index created.

SQL> select job from emp where job < 'Q' group by job;
JOB
---------
PRESIDENT
MANAGER
CLERK
ANALYST

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 
   1    0   SORT (GROUP BY NOSORT) (Cost=2 Card=10 
   2    1     INDEX (RANGE SCAN) OF 'DESC_JOB_EMP' (INDEX)

It is also stated in the doc : 
"The GROUP BY clause groups rows but does not guarantee the order of the result set. To order the groupings, use the ORDER BY clause."

About the distinct, I did not succeed until yet in finding a query that does not sort per col1,col2,col3... The doc do not specify whether it should sort or not, so thanks for pointing that it is not safe to assume it will ! 

Tom Kyte
June 20, 2005 - 10:38 am UTC

the only way to get sorted data....
use order by.

siva prakash, November 02, 2005 - 6:37 pm UTC

Thanks Kyte .

I learned something new today .

can we recreate this procedure with LOB datatype.


Tom Kyte
November 03, 2005 - 6:51 am UTC

A reader, November 04, 2005 - 1:04 am UTC

Hi Tom,

I modified the stragg for clob datatype. It works perfectly fine for small data. But for large data sets, it takes forever e.g

select employer_code,stragg(cust_name)
from Customer where rownum < 1001
group by employer_code

Customer table has 86,000 rows.

It returns results in few msecs. But when I remove the where clause, it takes close to 2hrs. At times it return with error "unable to extend TEMP tablespace"

I am using 9.2.0.6 dedicated server, with workarea_size_policy=manual, sort_area_size=64MB.

Is there any way, i can optimize it?


Tom Kyte
November 04, 2005 - 3:37 am UTC

build up strings of 32k of data and put them into a clob when they get close to 32k (buffer in a string, append big strings to the clob - instead of tons of small strings)

strAgg

Phil, November 09, 2005 - 11:49 am UTC

Hi Tom
Before we create our own Aggregate function, can you let us know if there is an easier way to do this?
We have a table with a number datatype and we simply want to do a product(column) rather than a sum(column). Is this easier than we think?

Packaged version of stragg

Stewart W. Bryson, January 04, 2006 - 4:37 pm UTC

Tom Kyte said:
"No additional functionality could be gained by a package here."

I wrote a package of ETL mappings that use STRAGG to generate MERGE commands from the dictionary. However, I'd like for this package to contain the STRAGG functionality for portability reasons. Even though there is no additional functionality gained from packaging STRAGG, could it be done anyway for portability?

Thanks.


Tom Kyte
January 05, 2006 - 9:16 am UTC

how would this enhance "portability" in any way?

as shown above, what you see is what you get - you need that function stub to expose it, you need the type to implement it.

what I meant by portability

Stewart W. Bryson, January 06, 2006 - 11:21 am UTC

This package will be rolled out to countless databases by other DBA's managing those systems. Management would like all the functionality inside one package.

If it can't be done... it can't be done. Just checking.

Thanks.

A pure SQL solution for the original "STRAGG" question

Frank Zhou, January 16, 2006 - 1:45 pm UTC

Hi Tom,
     Here is a pure SQL solution for the original "STRAGG" question,
user defined aggregate function "STRAGG" is not required.

Thanks,

Frank
     

SQL> SELECT * FROM STAGE;

CU PR OR OPERATOR                                                                                                       
-- -- -- ---------                                                                                                      
C1 P1 D1 OP1                                                                                                            
C1 P2 D2 OP2                                                                                                            
C1 P3 D3 OP3                                                                                                            
C1 P1 D4 OP3                                                                                                            
C2 P2 D1 OP2                                                                                                            
C2 P3 D2 OP4                                                                                                            
C3 P1 D3 OP1                                                                                                            
C3 P1 D2 OP1                                                                                                            
8 rows selected.

SQL> SELECT customer,
  2  MAX(LTRIM( sys_connect_by_path( product, ',' ) ,  ','))  prod_str,
  3  MAX(LTRIM( sys_connect_by_path( order_date , ',' ) ,  ','))  order_str,
  4  MAX(LTRIM( sys_connect_by_path( OPERATOR  , ',' ) ,  ','))  OPERATOR_str
  5  FROM
  6  (
  7    SELECT customer, product  , order_date , OPERATOR,
  8           row_number() over (PARTITION BY customer ORDER BY ROWID) rn
  9    FROM STAGE
 10  )
 11  START WITH rn = 1
 12  CONNECT BY customer = PRIOR customer AND  PRIOR  rn  = rn -1
 13  GROUP BY customer;

CU                                                                                                                      
--                                                                                                                      
PROD_STR                                                                                                                
------------------------------------------------------------------------------------------------------------------------
ORDER_STR                                                                                                               
------------------------------------------------------------------------------------------------------------------------
OPERATOR_STR                                                                                                            
------------------------------------------------------------------------------------------------------------------------
C1                                                                                                                      
P1,P2,P3,P1                                                                                                             
D1,D2,D3,D4                                                                                                             
OP1,OP2,OP3,OP3                                                                                                         
                                                                                                                        

CU                                                                                                                      
--                                                                                                                      
PROD_STR                                                                                                                
------------------------------------------------------------------------------------------------------------------------
ORDER_STR                                                                                                               
------------------------------------------------------------------------------------------------------------------------
OPERATOR_STR                                                                                                            
------------------------------------------------------------------------------------------------------------------------
C2                                                                                                                      
P2,P3                                                                                                                   
D1,D2                                                                                                                   
OP2,OP4                                                                                                                 
                                                                                                                        

CU                                                                                                                      
--                                                                                                                      
PROD_STR                                                                                                                
------------------------------------------------------------------------------------------------------------------------
ORDER_STR                                                                                                               
------------------------------------------------------------------------------------------------------------------------
OPERATOR_STR                                                                                                            
------------------------------------------------------------------------------------------------------------------------
C3                                                                                                                      
P1,P1                                                                                                                   
D3,D2                                                                                                                   
OP1,OP1                                                                                                                 
                                                                                                                        

SQL> spool off
 

Thanks

Guna Srinivas, March 29, 2006 - 1:36 am UTC

Hi Tom...
Really i appriciate your work... I learn new things everyday. This gives very confident to speak other tech. people..

Thanks for your valuable work...



How do i use stragg in oracle 8i

Guna Srinivas, March 29, 2006 - 1:40 am UTC

Hi Tom,
I am using oracle 8i (Client not ready to move 9i now). So that i could not execute stragg function. Is there anyway can i get the same out for the initial question...

Thanks in advance...



Tom Kyte
March 29, 2006 - 7:12 am UTC

search site for pivot

Reg: A pure SQL solution for the original "STRAGG" question

ST, July 26, 2006 - 3:16 pm UTC

From the above review dated January 16th, 2006 by Frank Zhou from Braintree, MA, Is there a way we can eliminate any duplicates from the list that is being concatenated. Like in the first row we see P1,P2,P3,P1. Can we have only P1,P2,P3 or P1+P2+P3 in the result set? Thanks for your help.

Tom Kyte
July 26, 2006 - 4:10 pm UTC

add some distincting in there. just get the distinct things you want to connect by first.

Reg: A pure SQL solution for the original "STRAGG" question

ST, July 26, 2006 - 4:22 pm UTC

Tried the distinct in the inline view but with the same result as it gets the row numbers as distinct. Still getting the same result (P1,P2,P3,P1).

Tom Kyte
July 26, 2006 - 4:25 pm UTC

if you just want the products, drop the other columns from the select before the distinct (the order date and stuff are not duplicated)

Reg: A pure SQL solution for the original "STRAGG" question

ST, July 26, 2006 - 4:34 pm UTC

No luck even after removing the additional fields. The problem that we are having is because if it gets the rn, then it thinks it is a distinct for that record in the inline view and hence the result is still showing P1 twice with P1,P2,P3,P1 where as I am looking for a result of P1,P2,P3. Thanks for your quick response.

Tom Kyte
July 26, 2006 - 5:03 pm UTC

i'm saying to distinct before that.


ops$tkyte%ORA10GR2> SELECT customer,
  2  MAX(LTRIM( sys_connect_by_path( product, ',' ) ,  ','))  prod_str
  3  FROM
  4  (
  5    SELECT customer, product,
  6           row_number() over (PARTITION BY customer ORDER BY product) rn
  7    FROM (select distinct customer, product from STAGE)
  8  )
  9  START WITH rn = 1
 10  CONNECT BY customer = PRIOR customer AND  PRIOR  rn  = rn -1
 11  GROUP BY customer;

CU PROD_STR
-- ----------------------------------------
C2 P2,P3
C1 P1,P2,P3
C3 P1
 

Reg: A pure SQL solution for the original "STRAGG" question

ST, July 26, 2006 - 5:36 pm UTC

Here is the query as per your suggestion except that I am having two tables in the sub query:

select r2.STDID, r2.term,
max(LTRIM( sys_connect_by_path(r2.location, ' ' ), ' ')) location
from ( select STDID, term, location,
row_number() over (partition by STDID order by STDID ) rn
from (select distinct reg.STDID, sec.term, sec.location
from abc.sections sec, abc.registration reg
where sec.SECTIONID=reg.SECTIONID
and sec.TERM=reg.TERM and sec.TERM='2001/2' )) r2
START WITH rn = 1
CONNECT BY r2.STDID = PRIOR r2.STDID AND PRIOR rn = rn -1
group by r2.VSTDID,r2.vterm;


Here is the result that I am getting where I am looking for 'MC' instead of multiple MC's


STDID|TERM|LOCATION
00001|2001/2|MC MC MC
00002|2001/2|MC
00003|2001/2|DL MC SLB
00004|2001/2|MC MC MC MC
00005|2001/2|MC MC MC MC MC MC MC MC MC
00006|2001/2|MC MC MC MC
00007|2001/2|MC MC MC MC
00008|2001/2|MC MC
00009|2001/2|MC MC MC MC
00010|2001/2|MC
00011|2001/2|MC MC DL
00012|2001/2|MC OEO

Any idea what should be done if there are two tables ?

Tom Kyte
July 26, 2006 - 5:44 pm UTC

create tables
insert intos

this doesn't look right to me, should work.

ST, July 26, 2006 - 6:02 pm UTC

The tables that are here with the required columns and are short listed as there are lot of columns from the original table.

CREATE TABLE REGISTRATION
(
TERM VARCHAR2(7),
STDID VARCHAR2(10),
SECTIONID VARCHAR2(7)
);

CREATE TABLE SECTIONS
(
TERM VARCHAR2(7),
SECTIONID VARCHAR2(21),
LOCATION VARCHAR2(10)
);


The data is comming from the these two tables and provided the results from the original tables. Hope this helps. The results provided are actual results. FYI

Tom Kyte
July 26, 2006 - 6:05 pm UTC

I sort of need some inserts?

looking for that "test case that reproduces the problem"

Reg: A pure SQL solution for the original "STRAGG" question

ST, July 26, 2006 - 6:33 pm UTC

OK here are some sample data inserts:

insert into sections values('2001/2','17459','MC');
insert into sections values('2001/2','18350','MC');
insert into sections values('2001/2','18227','MC');
insert into sections values('2001/2','17186','DL');
insert into sections values('2001/2','17796','MC');
insert into sections values('2001/2','17621','MC');
.......

insert into registration values('2001/2','00002','17459');
insert into registration values('2001/2','00002','18350');
insert into registration values('2001/2','00002','18227');
insert into registration values('2001/2','00181','17186');
insert into registration values('2001/2','00181','17796');
insert into registration values('2001/2','00181','17621');
....

Tom Kyte
July 26, 2006 - 6:46 pm UTC

that'll be a bug and you now have the perfect test case for support when you add these two queries:

ops$tkyte%ORA10GR2> select STDID, term, location,
  2                  row_number() over (partition by STDID order by STDID )  rn
  3             from (select distinct reg.STDID, sec.term, sec.location
  4                      from sections sec, registration reg
  5                      where sec.SECTIONID=reg.SECTIONID
  6                   and sec.TERM=reg.TERM and sec.TERM='2001/2' )
  7  /

STDID      TERM    LOCATION           RN
---------- ------- ---------- ----------
00002      2001/2  MC                  1
00181      2001/2  DL                  1
00181      2001/2  MC                  2

ops$tkyte%ORA10GR2> select rpad('*',2*level,'*') || r2.location loc, rn,
  2         r2.STDID, r2.term,
  3         (LTRIM( sys_connect_by_path(r2.location, ' ' ), ' ')) location
  4    from ( select STDID, term, location,
  5                  row_number() over (partition by STDID order by STDID )  rn
  6             from (select distinct reg.STDID, sec.term, sec.location
  7                      from sections sec, registration reg
  8                      where sec.SECTIdiONID=reg.SECTIONID
  9                   and sec.TERM=reg.TERM and sec.TERM='2001/2' )) r2
 10    START WITH rn = 1
 11    CONNECT BY r2.STDID = PRIOR r2.STDID AND  PRIOR  rn  = rn -1
 12  /

LOC                     RN STDID      TERM    LOCATION
--------------- ---------- ---------- ------- ----------
**MC                     1 00002      2001/2  MC
**DL                     1 00181      2001/2  DL
****DL                   2 00181      2001/2  DL DL
 
clearly shows the location is getting munged some how

the MC got turned into DL!

Here is the workaround (sort by something other than stdid)


ops$tkyte%ORA10GR2> select r2.STDID, r2.term,
  2         max(LTRIM( sys_connect_by_path(r2.location, ' ' ), ' ')) location
  3    from ( select STDID, term, location,
  4                  row_number() over (partition by STDID order by location )  rn
  5             from (select distinct reg.STDID, sec.term, sec.location
  6                      from sections sec, registration reg
  7                      where sec.SECTIONID=reg.SECTIONID
  8                   and sec.TERM=reg.TERM and sec.TERM='2001/2' )) r2
  9    START WITH rn = 1
 10    CONNECT BY r2.STDID = PRIOR r2.STDID AND  PRIOR  rn  = rn -1
 11   group by r2.STDID,r2.term;

STDID      TERM    LOCATION
---------- ------- ----------
00181      2001/2  DL MC
00002      2001/2  MC

 

Reg: A pure SQL solution for the original "STRAGG" question

ST, July 26, 2006 - 7:02 pm UTC

Thanks for your quick response and appreciate your help. But the problem still persists using the script provided by you and I am still getting 3 MC's for stdid = '00002' and so on for other stdid's. I am on Oracle 9.2.0.4 Does it make any difference with this version?

Tom Kyte
July 26, 2006 - 7:07 pm UTC

there appears to be a bug - please contact support, as I said the data is definitely munged up there.

If you go back to your original source data, you'll find there are not 3 MC's for that stdid, it is a bug.

ST, July 26, 2006 - 7:09 pm UTC

Thank you very much and appreciate for all your help. I will check back with the support on this issue.

STRAGG

ST, July 27, 2006 - 2:46 pm UTC

OK Now I was able to get the concatenated output using the STRAGG. Another issue with using STRAGG is I am unable to sort based on START WITH condition. Is there any way we can put conditions so that we can concatenate the sorted values in the string as per the requirements?



Tom Kyte
July 27, 2006 - 4:03 pm UTC

Stragg with clob and free memory

Ron, October 02, 2006 - 11:45 am UTC

Hi Tom,

we modify the fonction stragg for Clob variable in output. We had some problems with the clob free memory (clob are not free of the memory).

Here is the code of the new clob fonction:

drop function stragg_clobs;
drop type stragg_clob;



create or replace type stragg_clob as object
(
total clob,
static function
ODCIAggregateInitialize(sctx IN OUT stragg_clob )
return number,
member function
ODCIAggregateIterate(self IN OUT stragg_clob ,value IN varchar2 )
return number,
member function
ODCIAggregateTerminate(self IN OUT stragg_clob,
returnValue OUT clob,
flags IN number)
return number,
member function
ODCIAggregateMerge(self IN OUT stragg_clob,
ctx2 IN stragg_clob)
return number
);
/

--Type created.

create or replace type body stragg_clob
is
static function ODCIAggregateInitialize(sctx IN OUT stragg_clob)
return number
is
begin
sctx := stragg_clob( null );
return ODCIConst.Success;
end;
member function ODCIAggregateIterate(self IN OUT stragg_clob,
value IN varchar2 )
return number
is
begin
self.total := self.total || ',' || value;
return ODCIConst.Success;
end;

member function ODCIAggregateTerminate(self IN OUT stragg_clob,
returnValue OUT clob,
flags IN number)
return number
is
begin
returnValue := ltrim(self.total,',');
dbms_lob.freetemporary(self.total);
return ODCIConst.Success;
end;
member function ODCIAggregateMerge(self IN OUT stragg_clob,
ctx2 IN stragg_clob)
return number
is
begin
self.total := self.total || ctx2.total;

return ODCIConst.Success;
end;
end;
/
show error
--Type body created.


CREATE or replace
FUNCTION stragg_clobs(input varchar2 )
RETURN clob
PARALLEL_ENABLE AGGREGATE USING stragg_clob;
/



grant execute on stragg_clob to public;
--

grant execute on stragg_clobs to public;
--
drop public synonym stragg_clobs ;
create public synonym stragg_clobs for stragg_clobs;

if we do some query with this new fonction, we can see the memory is not free between each execution.

We use

select * from v$temporary_lobs

to check the cache_lobs.

If we use this new fonction with a lot of data, Oracle return us popup with the message "out of memory"

How can we free the memory so we won't have this message

Thanks
Ron

Tom Kyte
October 02, 2006 - 11:48 am UTC

is the client letting the clobs be freed? (eg: freeing the clob). You are sort of letting them copy the clob when they fetch it (think about what would happen if YOU closed THEIR clob - say I called your function and it returned to me 10 rows. I have now 10 clobs in my client program, but you think you "closed" them??)



how to get this

Rahul, November 08, 2006 - 7:20 pm UTC

i have a table containing 2 columns

position component_item
1 786
1 787
0 789
1 790
1

Now based on the position i have to construct a string like

786-787-null-789-790

whereever there is zero in the position i have to put a null value there and then construct a string

the position 0 can be any where in the outmost five records
and the component item is going to be 4
so i have to buid the string like given above

Tom Kyte
November 08, 2006 - 8:15 pm UTC

use stragg on

decode( position, 0, 'null-', '' ) || component_item

A 10G alternative to my 9I pure SQL STRAGG soultion

Frank Zhou, November 22, 2006 - 3:50 pm UTC

Hi Tom,

      Here is a 10G SQL Model Clause alternative to my 9i pure sql STRAGG soultion posted on January 16, 2006. SQL Model Clause is really awsome!! This is an other SQL Evolution...  

Thanks,

Frank

SQL> select customer,Prod_Str, order_str, OPERATOR_str
  2  from
  3   (select customer, Prod_Str, OPERATOR_str, order_str, rn, counter
  4   from stage
  5   model
  6   partition by (customer)
  7   dimension by (row_number()over(PARTITION BY customer ORDER BY rowid) rn )
      measures (product, order_date, operator, CAST(NULL AS VARCHAR2(3999)) 
               Prod_Str, CAST(NULL AS VARCHAR2(3999)) order_str, 
               CAST(NULL AS VARCHAR2(3999)) OPERATOR_str,
 10   count(*) over (partition by customer) counter
 11   )
 12   RULES (
 13   Prod_Str[ANY] ORDER BY rn  =
 14                 CASE WHEN product[cv() - 1 ] IS NULL
 15            THEN product[cv()]
 16                     ELSE Prod_Str[cv()-1]||','|| product[cv()]
 17        END ,
 18   order_str[ANY] ORDER BY rn  =
 19                 CASE WHEN order_date[cv() - 1 ] IS NULL
 20            THEN order_date[cv()]
 21                     ELSE order_str[cv()-1]||','|| order_date[cv()]
 22        END ,
 23   OPERATOR_str[ANY] ORDER BY rn  =
 24                 CASE WHEN operator[cv() - 1 ] IS NULL
 25            THEN operator[cv()]
 26                     ELSE OPERATOR_str[cv()-1]||','|| operator[cv()]
 27        END
 28    )
 29   )
 30  where counter = rn;

CU                                                                              
--                                                                              
PROD_STR                                                                        
--------------------------------------------------------------------------------
ORDER_STR                                                                       
--------------------------------------------------------------------------------
OPERATOR_STR                                                                    
--------------------------------------------------------------------------------
C1                                                                              
P1,P2,P3,P1                                                                     
D1,D2,D3,D4                                                                     
OP1,OP2,OP3,OP3                                                                 
                                                                                

CU                                                                              
--                                                                              
PROD_STR                                                                        
--------------------------------------------------------------------------------
ORDER_STR                                                                       
--------------------------------------------------------------------------------
OPERATOR_STR                                                                    
--------------------------------------------------------------------------------
C2                                                                              
P2,P3                                                                           
D1,D2                                                                           
OP2,OP4                                                                         
                                                                                

CU                                                                              
--                                                                              
PROD_STR                                                                        
--------------------------------------------------------------------------------
ORDER_STR                                                                       
--------------------------------------------------------------------------------
OPERATOR_STR                                                                    
--------------------------------------------------------------------------------
C3                                                                              
P1,P1                                                                           
D2,D3                                                                           
OP1,OP1                                                                         
                                                                                 

Re: A 10G alternative to my 9I pure SQL STRAGG soultion

Laurent Schneider, November 27, 2006 - 9:38 am UTC

a nice use of the MODEL clause

Another MODEL Solution

SnippetyJoe, February 22, 2007 - 1:37 pm UTC


In case anyone is interested, here's another way to use MODEL to perform string aggregation.

create table t6( group_key varchar2(10), val varchar2(10) );

insert into t6 values ( 'Group 1' , 'a'  );
insert into t6 values ( 'Group 2' , 'a'  );
insert into t6 values ( 'Group 2' , 'b'  );
insert into t6 values ( 'Group 3' , 'a'  );
insert into t6 values ( 'Group 3' , 'b'  );
insert into t6 values ( 'Group 3' , 'c'  );
insert into t6 values ( 'Group 4' , 'a'  );
insert into t6 values ( 'Group 4' , 'a'  );
insert into t6 values ( 'Group 4' , 'b'  );
insert into t6 values ( 'Group 4' , 'b'  );
insert into t6 values ( 'Group 5' , 'a'  );
insert into t6 values ( 'Group 5' , 'b'  );
insert into t6 values ( 'Group 5' , null );
insert into t6 values ( 'Group 5' , 'd'  );
insert into t6 values ( 'Group 5' , 'e'  );
insert into t6 values ( 'Group 6' , null );

commit;

column group_key format a10
column string    format a15

select
  group_key ,
  substr( string, 2 ) as string
from
  dual
where
  1 = 2
model
  reference t6_ref
    on
    (
      select
        row_number() over ( order by val ) - 1 as row_num ,
        count(*) over () - 1                   as max_row_num ,
        group_key ,
        val
      from
        t6
      where
        val is not null
      order by
        val
    )
    dimension by( row_num )
    measures    ( max_row_num, group_key, val )
  main t6_main
    dimension by ( cast( null as varchar2(4000) ) as group_key )
    measures     ( cast( null as varchar2(4000) ) as string )
    rules
      iterate( 4294967295 )
      until
      ( t6_ref.max_row_num[0] is null or
        iteration_number >= t6_ref.max_row_num[0]
      )
    (
      string[ t6_ref.group_key[iteration_number] ] =
        string[ cv() ] || ',' ||
        t6_ref.val[ iteration_number ]
    )
order by
  1
;

GROUP_KEY  STRING
---------- ---------------
Group 1    a
Group 2    a,b
Group 3    a,b,c
Group 4    a,a,b,b
Group 5    a,b,d,e


You can find a detailed explanation of how this query works here.
http://www.sqlsnippets.com/en/topic-11754.html
A performance comparison between this method and hierarchical, object relational, and STRAGG solutions is available here.
http://www.sqlsnippets.com/en/topic-11783.html

Using STRAGG in a child table join

Bill Ferguson, March 23, 2007 - 9:38 am UTC

Hi Tom,

Your stragg function is great, but I am having difficulties figuring out how to use it in a join for child tables with multiple entries. I can use it on one table without problems, but I can't figure out how to use it with a join on child tables.

I would like the parent table information with the child table information STRAGG'ed.

My (abbreviated) tables are:
CREATE TABLE DEPOSITS_BASE
(
  DEP_ID            NUMBER(12)                  NOT NULL,
  DEV_ST            VARCHAR2(25 BYTE)           NOT NULL,
  OPER_TP           VARCHAR2(30 BYTE)           NOT NULL,
  SIG               VARCHAR2(1 BYTE));
/

Insert into DEPOSITS_BASE (DEP_ID,DEV_ST,OPER_TP,SIG) 
   values (55555555,'Past Producer','Underground','N');
/


CREATE TABLE COMMODITY_BASE
(
  DEP_ID       NUMBER(12)                       NOT NULL,
  LINE         NUMBER(4)                        NOT NULL,
  COMMOD_GROUP VARCHAR2(40 BYTE)                NOT NULL);
/

Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,5,'Abrasives');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,2,'Tungsten');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,4,'Copper');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,1,'Gold');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,3,'Silver');
/

CREATE TABLE LOCATIONS_BASE
(
  DEP_ID       NUMBER(12)                       NOT NULL,
  LINE         NUMBER(4)                        NOT NULL,
  COUNTRY      VARCHAR2(70 BYTE),
  STATE_PROV   VARCHAR2(80 BYTE),
  COUNTY       VARCHAR2(80 BYTE) );
/

Insert into LOCATIONS_BASE (DEP_ID,LINE,COUNTRY,STATE_PROV,COUNTY) 
  values (55555555,1,'United States','Colorado','Adams');
Insert into LOCATIONS_BASE (DEP_ID,LINE,COUNTRY,STATE_PROV,COUNTY) 
  values (55555555,3,'United States','Montana','Deer Lodge');
Insert into LOCATIONS_BASE (DEP_ID,LINE,COUNTRY,STATE_PROV,COUNTY) 
  values (55555555,2,'United States','Nevada','White Pine');
/


The DEP_ID field of course is the common link between the tables.

Ideally I would like a one line output (per DEP_ID) of the STRAGG'ed output sorted by LINE as well, as that signifies importance to the end users. So, with the above sample data, what the end-users would like would be all the data from DEPOSITS_BASE, the STRAGG'ed COMMOD_GROUP, and somehow the STRAGG'ed output of COUNTRY, STATE_PROV, and COUNTY.

I'm completely lost trying to figure out how to accomplish this, if it even can be.

Thanks,
Bill Ferguson
Tom Kyte
March 23, 2007 - 10:08 am UTC

child tables have nothing to do with it - it is like using avg, sum, min, max.

It is just an aggregate function.


How about this - you supply the query to join the above tables together and then, we'll just wrap that query (call it Q) in ()


select *
from (Q)


Now, we can just

select column, stragg(another_column)
from (Q)
group by column;


Using STRAGG in a child table join (revisited)

Bill Ferguson, March 23, 2007 - 11:12 am UTC

Hi again Tom,

I must be a real dummy. I just can't seem to get it to work correctly with three tables.

Doing the following (with two tables) works fine:

SELECT dep_id, stringagg(commod_group) commod_group --, stringagg(country) country,
   --stringagg(state_prov) state_prov, stringagg(county) county
FROM (
SELECT dep_id, dev_st, oper_tp, sig, d.NAME, commod_group --,
--       country, state_prov, county
FROM ( --( 
 deposits d LEFT OUTER JOIN commodity c USING (dep_id))
--   LEFT OUTER JOIN locations l USING (dep_id))
WHERE dep_id = 55555555   
)
GROUP BY dep_id


I get the one line output with the commod_group aggregated together.

When I remove the '--' comment markers and re-run it, then I get back multiples of each commod_group, one for each entry in the locations_base table.

I'll get back 3 Gold's, 3 Silver's, etc. I can see why, with the join creating a line for each combination in the enclosed query, but I can't figure out how to (I guess) get a distinct stringagg output per column.

Tom Kyte
March 23, 2007 - 11:14 am UTC

give us your query table query

Using STRAGG in a child table join (revisited again)

Bill Ferguson, March 23, 2007 - 1:17 pm UTC

Okay, I guess maybe these questions are sent as emails, so seeing the previous 'post' is difficult?

Anyway, I'll consolidate the above two posts into one.

I would like the parent table information with the child table information STRAGG'ed.

My (abbreviated) tables are:
CREATE TABLE DEPOSITS_BASE
(
  DEP_ID            NUMBER(12)                  NOT NULL,
  DEV_ST            VARCHAR2(25 BYTE)           NOT NULL,
  OPER_TP           VARCHAR2(30 BYTE)           NOT NULL,
  SIG               VARCHAR2(1 BYTE));
/

Insert into DEPOSITS_BASE (DEP_ID,DEV_ST,OPER_TP,SIG) 
   values (55555555,'Past Producer','Underground','N');
/


CREATE TABLE COMMODITY_BASE
(
  DEP_ID       NUMBER(12)                       NOT NULL,
  LINE         NUMBER(4)                        NOT NULL,
  COMMOD_GROUP VARCHAR2(40 BYTE)                NOT NULL);
/

Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,5,'Abrasives');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,2,'Tungsten');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,4,'Copper');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,1,'Gold');
Insert into COMMODITY_BASE (DEP_ID,LINE,COMMOD_GROUP) 
  values (55555555,3,'Silver');
/

CREATE TABLE LOCATIONS_BASE
(
  DEP_ID       NUMBER(12)                       NOT NULL,
  LINE         NUMBER(4)                        NOT NULL,
  COUNTRY      VARCHAR2(70 BYTE),
  STATE_PROV   VARCHAR2(80 BYTE),
  COUNTY       VARCHAR2(80 BYTE) );
/

Insert into LOCATIONS_BASE (DEP_ID,LINE,COUNTRY,STATE_PROV,COUNTY) 
  values (55555555,1,'United States','Colorado','Adams');
Insert into LOCATIONS_BASE (DEP_ID,LINE,COUNTRY,STATE_PROV,COUNTY) 
  values (55555555,3,'United States','Montana','Deer Lodge');
Insert into LOCATIONS_BASE (DEP_ID,LINE,COUNTRY,STATE_PROV,COUNTY) 
  values (55555555,2,'United States','Nevada','White Pine');
/


Ideally I would like a one line output (per DEP_ID) of the STRAGG'ed output sorted by LINE as well, as that signifies importance to the end users. So, with the above sample data, what the end-users would like would be all the data from DEPOSITS_BASE, the STRAGG'ed COMMOD_GROUP, and somehow the STRAGG'ed output of COUNTRY, STATE_PROV, and COUNTY.

If I do:
SELECT dep_id, stringagg(commod_group) commod_group --, stringagg(country) country,
   --stringagg(state_prov) state_prov, stringagg(county) county
FROM (
SELECT dep_id, dev_st, oper_tp, sig, d.NAME, commod_group --,
--       country, state_prov, county
FROM ( --( 
 deposits_base d LEFT OUTER JOIN commodity_base c USING (dep_id))
--   LEFT OUTER JOIN locations_base l USING (dep_id))
WHERE dep_id = 55555555   
)
GROUP BY dep_id;


It works fine. The commod_group is STRAGG'ed together as expected. If I remove the comment markers and rerun it, then the commod_group, country, state_prov, and county entries are repeated for each additional join against LOCATIONS_BASE.
Tom Kyte
March 26, 2007 - 6:59 am UTC

well, I don't believe this to be a "stragg" problem at all - but rather just a "my data model doesn't support my question" (replace stragg with ANY OTHER AGGREGATE like min or max and you'll see what I mean)

You have what appears to be a 1:m + 1:m relationship here.

Say you have a relation DEPT TO EMP - that is one to many. And you have a deptno with 5 employees - you expect "5 rows"

Say you have a relation DEPT TO DEPT_MANAGERS and it is one to many. And that deptno with 5 employess - has 2 managers.

Say you join dept to emp and dept to dept_managers in a single query. You get 10 rows (5x2) for that deptno.

There are no duplicates, you are just getting the natural result of two 1:m relations in a single query.

I cannot get your query to run - some word wrapping going on. Best not to be tricky with comments like that - not sure what goes on which line from this end. I see the last comment really only - I'd have to page up and up and up to rebuild any sort of context.

But, this is not stragg - this is just a natural side effect of your data. That is what I asked for a query without aggregates that had the right data - sort of thinking "such a query doesn't exist"

Another way of using SQL modeling

Charlie Zhu, April 19, 2007 - 6:10 pm UTC

http://tkyte.blogspot.com/2006/08/evolution.html
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:15637744429336

String Aggregate, Another way of using SQL modeling.

3 steps to illustrate it:

step 1,

select job, ename,
row_number() over ( partition by job order by ename ) as position
, count(*) over ( partition by job ) emp_cnt
from scott.emp;

JOB ENAME POSITION EMP_CNT
--------- ---------- ---------- ----------
ANALYST FORD 1 2
ANALYST SCOTT 2 2
CLERK ADAMS 1 4
CLERK JAMES 2 4
CLERK MILLER 3 4
CLERK SMITH 4 4
MANAGER BLAKE 1 3
MANAGER CLARK 2 3
MANAGER JONES 3 3
PRESIDENT KING 1 1
SALESMAN ALLEN 1 4
SALESMAN MARTIN 2 4
SALESMAN TURNER 3 4
SALESMAN WARD 4 4

14 rows selected.

Elapsed: 00:00:00.04

Execution Plan
----------------------------------------------------------
Plan hash value: 3145491563

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 196 | 3 (34)| 00:00:01 |
| 1 | WINDOW SORT | | 14 | 196 | 3 (34)| 00:00:01 |
| 2 | TABLE ACCESS FULL| EMP | 14 | 196 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------
step 2,

select job, empname, position, emp_cnt, enames from
(
select job, ename,
row_number() over ( partition by job order by ename ) as position
, count(*) over ( partition by job ) emp_cnt
from scott.emp
)
MODEL
RETURN UPDATED ROWS
MAIN simple_model
PARTITION BY (job)
DIMENSION BY (position)
MEASURES (ename empname, ' ' enames, emp_cnt emp_cnt)
RULES
(enames[any] = empname[CV()] ||','||enames[CV()-1]
)
;

JOB EMPNAME POSITION EMP_CNT ENAMES
--------- ---------- ---------- ---------- --------------------------------
CLERK ADAMS 1 4 ADAMS,
CLERK JAMES 2 4 JAMES,ADAMS,
CLERK MILLER 3 4 MILLER,JAMES,ADAMS,
CLERK SMITH 4 4 SMITH,MILLER,JAMES,ADAMS,
SALESMAN ALLEN 1 4 ALLEN,
SALESMAN MARTIN 2 4 MARTIN,ALLEN,
SALESMAN TURNER 3 4 TURNER,MARTIN,ALLEN,
SALESMAN WARD 4 4 WARD,TURNER,MARTIN,ALLEN,
PRESIDENT KING 1 1 KING,
MANAGER BLAKE 1 3 BLAKE,
MANAGER CLARK 2 3 CLARK,BLAKE,
MANAGER JONES 3 3 JONES,CLARK,BLAKE,
ANALYST FORD 1 2 FORD,
ANALYST SCOTT 2 2 SCOTT,FORD,

14 rows selected.

Elapsed: 00:00:00.04

Execution Plan
----------------------------------------------------------
Plan hash value: 2296444387

-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 546 | 3 (34)| 00:00:01 |
| 1 | SQL MODEL ORDERED | | 14 | 546 | | |
| 2 | VIEW | | 14 | 546 | 3 (34)| 00:00:01 |
| 3 | WINDOW SORT | | 14 | 196 | 3 (34)| 00:00:01 |
| 4 | TABLE ACCESS FULL| EMP | 14 | 196 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------

step 3,

select * from
(
select job, empname, position, emp_cnt, enames from
(
select job, ename,
row_number() over ( partition by job order by ename ) as position
, count(*) over ( partition by job ) emp_cnt
from scott.emp
)
MODEL
RETURN UPDATED ROWS
MAIN simple_model
PARTITION BY (job)
DIMENSION BY (position)
MEASURES (ename empname, ' ' enames, emp_cnt emp_cnt)
RULES
(enames[any] = empname[CV()] ||Nvl2(enames[CV()-1],',','')||enames[CV()-1]
)
)
where position = emp_cnt
;

JOB EMPNAME POSITION EMP_CNT ENAMES
--------- ---------- ---------- ---------- --------------------------------
CLERK SMITH 4 4 SMITH,MILLER,JAMES,ADAMS
SALESMAN WARD 4 4 WARD,TURNER,MARTIN,ALLEN
PRESIDENT KING 1 1 KING
MANAGER JONES 3 3 JONES,CLARK,BLAKE
ANALYST SCOTT 2 2 SCOTT,FORD

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 3377437572

------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 588 | 3 (34)| 00:00:01 |
|* 1 | VIEW | | 14 | 588 | 3 (34)| 00:00:01 |
| 2 | SQL MODEL ORDERED | | 14 | 546 | | |
| 3 | VIEW | | 14 | 546 | 3 (34)| 00:00:01 |
| 4 | WINDOW SORT | | 14 | 196 | 3 (34)| 00:00:01 |
| 5 | TABLE ACCESS FULL| EMP | 14 | 196 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------

stragg

Jay, September 04, 2007 - 3:12 pm UTC

Hello Tom,
Thanks a bunch for letting us know about the stragg function. Especially, I love this revision that you have made where it sorts the aggregated data. When you get a chance, could you please explain to me what you are doing here?? I'm like the Alice who's lost in the wonderland here.

I would be very thankful to you if you could please guide me Tom. Thank you so much and I apologize for the bother!

ops$tkyte@ORA9IR2> create or replace type vcArray as table of varchar2(4000)
  2  /
Type created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace type string_agg_type as object
  2  (
  3     data  vcArray,
  4
  5     static function
  6          ODCIAggregateInitialize(sctx IN OUT string_agg_type )
  7          return number,
  8
  9     member function
 10          ODCIAggregateIterate(self IN OUT string_agg_type ,
 11                               value IN varchar2 )
 12          return number,
 13
 14     member function
 15          ODCIAggregateTerminate(self IN string_agg_type,
 16                                 returnValue OUT  varchar2,
 17                                 flags IN number)
 18          return number,
 19
 20     member function
 21          ODCIAggregateMerge(self IN OUT string_agg_type,
 22                             ctx2 IN string_agg_type)
 23          return number
 24  );
 25  /
 
Type created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace type body string_agg_type
  2  is
  3
  4  static function ODCIAggregateInitialize(sctx IN OUT string_agg_type)
  5  return number
  6  is
  7  begin
  8      sctx := string_agg_type( vcArray() );
  9      return ODCIConst.Success;
 10  end;
 11
 12  member function ODCIAggregateIterate(self IN OUT string_agg_type,
 13                                       value IN varchar2 )
 14  return number
 15  is
 16  begin
 17      data.extend;
 18      data(data.count) := value;
 19      return ODCIConst.Success;
 20  end;
 21
 22  member function ODCIAggregateTerminate(self IN string_agg_type,
 23                                         returnValue OUT varchar2,
 24                                         flags IN number)
 25  return number
 26  is
 27      l_data varchar2(4000);
 28  begin
 29      for x in ( select column_value from TABLE(data) order by 1 )
 30      loop
 31              l_data := l_data || ',' || x.column_value;
 32      end loop;
 33      returnValue := ltrim(l_data,',');
 34      return ODCIConst.Success;
 35  end;
 36
 37  member function ODCIAggregateMerge(self IN OUT string_agg_type,
 38                                     ctx2 IN string_agg_type)
 39  return number
 40  is
 41  begin -- not really tested ;)
 42      for i in 1 .. ctx2.data.count
 43      loop
 44              data.extend;
 45              data(data.count) := ctx2.data(i);
 46      end loop;
 47      return ODCIConst.Success;
 48  end;
 49
 50
 51  end;
 52  /
 
Type body created.
 
ops$tkyte@ORA9IR2> show err
No errors.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> CREATE or replace
  2  FUNCTION stragg(input varchar2 )
  3  RETURN varchar2
  4  PARALLEL_ENABLE AGGREGATE USING string_agg_type;
  5  /
 
Function created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> column ename format a40
ops$tkyte@ORA9IR2> select deptno, stragg(ename) ename
  2    from emp
  3   group by deptno
  4  /
 
    DEPTNO ENAME
---------- ----------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD


Tom Kyte
September 05, 2007 - 1:54 pm UTC

which part needs to be explained?

thanks for the response..

Jay, September 05, 2007 - 3:18 pm UTC

Thanks for the response Tom. Just a brief overview of the logic you've incorporated here would be very helpful.
How can I start thinking to come up with a logic like that in the creation of type body? How do I know when to use the ODCIAggregateTerminate, Initialize, Merge etc, functions. Where do I get a list of complete functions such as this? How do I decide when to do a type body or a function or a package or a stored procedure? I am sorry for asking some really weird questions, but, I have no prior experience with the pl/sql programming and want to get better at it. I am just amazed by this site and your skills. Thanks a lot Tom. You are a gift to the oracle world!

Thanks!

Jay, September 06, 2007 - 9:27 am UTC

Thanks for your time and the link Tom.
Have a wonderful day!

sorted stragg

Sean, December 10, 2007 - 11:43 am UTC

I've seen many mentionings of aggregates applying to SETS, which is why ordering doesn't apply. I understand that.

But, what about analytics?

stragg is usable as an analytic as well as an aggregate.

I've used the following syntax with stragg on many occasions successfully. Have I just been getting lucky?

SELECT DISTINCT deptno,
stragg(ename) OVER(PARTITION BY deptno
ORDER BY ename DESC
ROWS BETWEEN
UNBOUNDED PRECEDING
AND
UNBOUNDED FOLLOWING) AS enames
FROM emp

If this is a bad idea, please explain why and also explain how this could be accomplished with the connect by trick.

Thanks!
Tom Kyte
December 10, 2007 - 12:13 pm UTC

it works - it is not the most "efficient" method, but it works.

not working for me :-(

Jay, December 10, 2007 - 3:25 pm UTC

Hi Tom,
Could you please tell me why the sorting wont work for me?
with test as
(select '12345' as ids, 21 as code from dual             
 union all
 select '12345' as ids, 26 as code from dual                
 union all
 select '12345' as ids, 28 as code from dual)
 
select distinct ids,
       stragg(code) over (partition by ids order by code desc 
                          rows between unbounded preceding and unbounded following) as code
  from test 


I get '21,26,28' as the output. Shouldn't it be '28,26,21'? Am I missing something?

Thank you for your help as always!

Jay
Tom Kyte
December 10, 2007 - 4:21 pm UTC

ops$tkyte%ORA9IR2> with test as
  2  (select '12345' as ids, 21 as code from dual
  3   union all
  4   select '12345' as ids, 26 as code from dual
  5   union all
  6   select '12345' as ids, 28 as code from dual)
  7  select distinct ids,
  8         stragg(code) over (partition by ids order by code desc
  9                            rows between unbounded preceding and unbounded following) as code
 10    from test
 11  /

IDS   CODE
----- --------------------
12345 28,26,21


I get that in 9i, 10g, and 11g...



Oh my God! Spoke to soon...

Jay, December 10, 2007 - 3:37 pm UTC

Sorry Tom. I found out the issue. I used the stragg script that does the sorting.

member function ODCIAggregateTerminate(self IN string_agg_type,
returnValue OUT varchar2,
flags IN number)
return number
is
l_data varchar2(4000);
begin
for x in ( select column_value from TABLE(data) order by 1 )
loop
l_data := l_data || ', ' || x.column_value;
end loop;
returnValue := ltrim(l_data,',');
return ODCIConst.Success;
end;

This was the problem :-(

Sorry,
Jay

thanks Tom

Jay, December 10, 2007 - 4:49 pm UTC

Yes Tom. You are right.
Sorry for wasting your time. I almost immediately replied to that question myself with the issue.
Thanks again!
Jay

Question [out of curiosity]

Jay, December 11, 2007 - 1:42 pm UTC

Hi Tom,
Good day! In one of your comments you had said this regarding the sorting of the stragg function.

"Followup December 10, 2007 - 12pm US/Eastern:
it works - it is not the most "efficient" method, but it works. "

Could you please let us know if there is a more efficient way to do the same when you have some time?

Thanks a lot for your help!
Jay

Tom Kyte
December 11, 2007 - 9:28 pm UTC

read original answer?

thanks!

Jay, December 12, 2007 - 4:22 pm UTC

Thanks a lot Tom.

Can you please clarify one thing for me? The other user posted a way to sort using analytics.

code from the other user:

SELECT DISTINCT deptno,
                stragg(ename) OVER(PARTITION BY deptno 
                                   ORDER BY ename DESC 
                                   ROWS BETWEEN 
                                        UNBOUNDED PRECEDING
                                        AND 
                                        UNBOUNDED FOLLOWING)    AS enames
FROM emp


Now, instead of doing analytics and distinct, can I just do the following query to sort? This is more simple and I think it will give me the right data. Can you please advice?

My code:

select deptno,
       stragg(ename)
  from (select deptno,                         
               ename
          from emp
        order by deptno,
                 ename desc)
group by deptno  


thanks for your time!

Jay
Tom Kyte
December 13, 2007 - 9:09 am UTC

no, you cannot - the data fed to the aggregate is fed in a non-determined fashion - we can do a hash group by for example...

it might appear to work by accident - but it would not be assured.

thanks

Jay, December 13, 2007 - 9:25 am UTC

Okay. Thanks for your reply!

I am going to be attending the conference this evening in Reston Tom! I can't wait to see you talk (LIVE, In-Person). W0W! It's going to be great.

Thanks for the opportunity! I already made my reservation. Woohooooooo...

Thanks!
Jay

awesome presentation..

Jay, December 14, 2007 - 9:43 am UTC

Hi Tom,

It was a great presentation yesterday at the reston oracle center. It was indeed a pleasure to see you in person and hear you talk!

I have one quick suggestion. You can completely ignore this comment though ;-)

If future presentations, I think it would be really helpful for everyone if you please ask the audience to wait until the end of the session for the questions? There were just way too many questions (by the same two or three 'smartass-know-it-alls' in the audience) during the session and thus, you couldn't even get through half of the slides.

Just a thought :-)

Thanks again!

Jay
Tom Kyte
December 14, 2007 - 1:22 pm UTC

I give seminars - they last 2 to 3 days. The questions are mandatory - I'd be done in a day if it were not for them.

The questions are really pretty good actually, I was commenting to a friend afterwards that I thought the questions were actually quite good - David Aldridge was the fellow that asked me a question that I said "I don't know" to (about encrypted tablespaces and transporting) :) and he emailed me this morning:

...
David Aldridge wrote:
> Documented as a restriction, of course.
>
> http://docs.oracle.com/docs/cd/B28359_01/server.111/b28310/tspaces013.htm#i1007233
>
> It's interesting that one apparent advantage of tablespace encryption is that you can transport them, albeit with restrictions, whereas non-encrypted tablespaces with tables having encrypted columns cannot be. That might be a valid reason to choose TS encryption over column encryption.
....

So find the questions good - I encourage them.

but yes, I did not expect to spend 25 minutes on my first slide :) that did put us behind schedule a bit!

Stragg

DK, April 15, 2008 - 10:23 am UTC

Hi Tom

Thank you for all the knowledge you have been sharing .

I have been trying to use stragg for converting rows to columns

I have the following table and data

create table listingtest (listingid varchar2(100),categoryid varchar2(20),categoryname varchar2(300))

tablespace vztables2;

insert into listingtest values('0001','5000','Taxis');

insert into listingtest values('0001','5001','Limousine service');

insert into listingtest values('0001','5002','Transportation service');

insert into listingtest values('0002','5004','ATM servcies');

insert into listingtest values('0002','50005','Banking services');

insert into listingtest values('0003','5000','Taxis');

insert into listingtest values('0003','5004','ATM services');

select * from listingtest

LISTINGID CATEGORYID CATEGORYNAME
------------------------- -------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------
0001 5000 Taxis
0001 5001 Limousine service
0001 5002 Transportation service
0002 5004 ATM servcies
0002 50005 Banking services
0003 5000 Taxis
0003 5004 ATM services


I used stragg to get

select listingid,stragg(categoryid) categoryid,stragg(categoryname) categoryname from listingtest group

by listingid
LISTINGID CATEGORYID CATEGORYNAME


------------------------- -------------------------

--------------------------------------------------------------------
0001 5000,5001,5002 Taxis,Transportation service,Limousine service


0002 5004,50005 ATM servcies,Banking services


0003 5000,5004 Taxis,ATM services

I need something like

LISTINGID CATEGORYID CATEGORYNAME CATEGORYID CATEGORYNAME CATEGORYID CATEGORYNAME CATEGORYID CATEGORYNAME

0001 5000 Taxis 5001 Transportation service 5002 Limousine service

0002 5004 ATM servcies 50005 Banking services

0003 5000 Taxis 5004 ATM services


Please help

thanks
DK


Tom Kyte
April 16, 2008 - 3:00 pm UTC

search this site for PIVOT


can this be done in msaccess

Duke, September 24, 2008 - 5:43 am UTC

Tom, so so so sorry to have to ask this;
my project leader want to do this in a small access database.
I thought about defining hetrogenous services and do it with stragg, but he said he want to be able to do this from access. I can't figure out how to do this in access, and i doubt wheter i want to do this, but he asked me.
So once again sorry to ask, but do you know of a smiliar concept in access?
Tom Kyte
September 24, 2008 - 7:22 pm UTC

I am sorry - I do not know :)

stragg ora-06502

Jay, October 07, 2008 - 5:29 pm UTC

Hello Tom,
The function 'stragg' has been very helpful indeed. For the first time, I get the following error with this function

ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at "STRING_AGG_TYPE", line 31

Line 31 refers to the following piece of code in the type body -

l_data := l_data || ', ' || x.column_value;

Could you please let me know what could be going wrong?

Thanks for your time!
Jay


Tom Kyte
October 08, 2008 - 9:58 pm UTC

stragg as I defined it work with varchar2(4000)

you must need more

but then you are into the world of clobs. search - you'll find an implementation that does clobs - but I would recommend against it, the amount of data is probably just too long at that point.

order and ODCIAggregateMerge

Alex, April 17, 2009 - 5:16 am UTC

Hi, Tom!
Does "order by" in "stragg(...) over (order by ..)" guarantee that result will be ordered (in case PARALLEL_ENABLE stragg).
I mean is there a guarantee that ODCIAggregateMerge will merge thread results in correct (sort-wise) order?
or it can works like this:
id thread stragg
1 1 1
2 1,2 1,2
3 3 3,1,2
4 3,4 3,4,1,2
5 5 3,4,1,2,5
6 5,6 3,4,1,2,5,6
Tom Kyte
April 17, 2009 - 9:51 am UTC

no, unless and until there is an order by on the STATEMENT, the rows in the result set can and will be returned in any order we feel like returning them.


But I'm not sure you are talking about the STATEMENT here - and given I have no idea what your example represents - I cannot really answer you...

order and ODCIAggregateMerge

Alex, April 17, 2009 - 11:30 am UTC

Ok. Let me ask this another way.

For example:
  1  select deptno,
  2  stragg(ename) over(partition by deptno order by ename) s,
  3  row_number() over(order by deptno,ename) rn
  4*  from emp
SQL> /

   DEPTNO S                                               RN
--------- ---------------------------------------- ---------
       10 CLARK                                            1
       10 CLARK,KING                                       2
       10 CLARK,KING,MILLER                                3
       20 ADAMS                                            4
       20 ADAMS,FORD                                       5
       20 ADAMS,FORD,JONES                                 6
       20 ADAMS,FORD,JONES,SCOTT                           7
       20 ADAMS,FORD,JONES,SCOTT,SMITH                     8
       30 ALLEN                                            9
       30 ALLEN,BLAKE                                     10
       30 ALLEN,BLAKE,JAMES                               11
       30 ALLEN,BLAKE,JAMES,MARTIN                        12
       30 ALLEN,BLAKE,JAMES,MARTIN,TURNER                 13
       30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD            14

14 rows selected.

Is there any probability that oracle return
       30 MARTIN,TURNER,WARD,ALLEN,BLAKE,JAMES            14
instead of
       30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD            14
because
"STRAGG can be done in parallel. So even though each thread will get its portion of rows sorted (which again can be debated), we have no clue in what order threads will finish. Therefore there is no guarantee ODCIAggregateMerge will merge thread results in correct (sort-wise) order."

I know, that "the only way to get sorted data....   use order by", but question about order in each row.

Tom Kyte
April 17, 2009 - 3:52 pm UTC

the data would be fed to the aggregate in order, the rows returned by the query would not be in any order.

So, stragg would see the data "in order of ename" for each piece - and aggregate merge would get two sorted pieces - if it just put them together - the resulting string might not be sorted anymore

ctl-f for

Ok, here is one that does "sorted" aggregates


that one would work, since it SORTS the data itself.

How to Eliminate in active & Duplicate Names

Sambi, June 09, 2009 - 2:06 pm UTC

Hi tom,

CREATE TABLE TNX_FRM
(
LOC_ID INTEGER,
FORM_CD VARCHAR2(50 BYTE),
CHANGE_NO INTEGER,
IS_ACTIVE VARCHAR2(1 BYTE)
);



Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'HOME', 1, 'Y');
Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'ALARM', 1, 'Y');
Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'ALARM', 1, 'Y');
Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'ADJUSTMENT', 1, 'Y');
Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'CONTRACT', 1, 'Y');
Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'DEDUCTIBLE', 1, 'Y');
Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'SAFEGUARD', 1, 'Y');
Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'SAFEGUARD', 2, 'N');
Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'ALARM', 2, 'N');
Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'CONTRACT', 2, 'N');
Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'CONTRACT', 3, 'Y');
Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'DEDUCTIBLE', 2, 'N');
Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'ADJUSTMENT', 2, 'N');
Insert into TNX_FRM
(LOC_ID, FORM_CD, CHANGE_NO, IS_ACTIVE)
Values
(55, 'DEDUCTIBLE', 3, 'Y');

COMMIT;



select loc_id,form_cd,change_no,is_active
from tnx_frm
order by loc_id,change_no,form_cd


LOC_ID FORM_CD CHANGE_NO IS_ACTIVE

55 ADJUSTMENT 1 Y
55 ALARM 1 Y
55 ALARM 1 Y
55 CONTRACT 1 Y
55 DEDUCTIBLE 1 Y
55 HOME 1 Y
55 SAFEGUARD 1 Y
55 ADJUSTMENT 2 N
55 ALARM 2 N
55 CONTRACT 2 N
55 DEDUCTIBLE 2 N
55 SAFEGUARD 2 N
55 CONTRACT 3 Y
55 DEDUCTIBLE 3 Y

14 rows selected.

I want the below output.


LOC_ID CHANGE_NO NAMES

55 1 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
55 2 HOME
55 3 HOME,CONTRACT,DEDUCTIBLE

is_active indicates whether the name(form_cd) is active or has been dropped.The names get carried over
to the next change nos unless they are explicitly set in that change_no using is_active='N'

for eg: at change_no 2 all the other names(form_cd) except HOME are no more valid because they are explicitly set as 'N'

i can use stragg or or sys_connect_by_path trick for this kind of queries but

i) i have to get the distinct codes on loc_id level otherwise i get duplicates in the list
(eg : for change_no 3 i get HOME,CONTRACT,DEDUCTIBLE,CONTRACT,DEDUCTIBLE.)

ii) when i build the path a particular name may become invalid in the current change but the same name may be inherited from

a previous change.that case i may need to do a INSTR to see if the name is already coming from a previous change ..but that

is not going to work as i have multiple names de-activated in the same change ( eg: change_no 2)


i am using the following to insert into a table

select loc_id,change_no,(select wm_concat(distinct case when is_active='N' then NULL
else form_cd end )
from misp_temp.tnx_frm i
where i.change_no <= t.change_no
) names
from misp_temp.tnx_frm t
group by loc_id,change_no
order by change_no

and then loop thru each change to see if there are any actives and remove them..

I don't want to use wm_concat as this is not documented.

I prefer to get this in a query.

Thanks in Advance.

Tom Kyte
June 10, 2009 - 8:47 am UTC

... for eg: at change_no 2 all the other names(form_cd) except HOME are no more
valid because they are explicitly set as 'N'
...

does not make sense. what is special about "HOME", it comes before SAFEGUARD in the output above - what made it so darn special

you need to explain this much much better - and it would see to me that people didn't necessarily think about what they were doing with this data when they modelled the schema for it...

pretend you were giving the specifications for this to your mom, be that specific, that precise, using terminology everyone can understand (eg: why would you write: ... at change_no 2 all the other names(form_cd) ... - just to be purposely confusing? "Well, I think of the attribute as a name - but we really call it form_cd, so when I say 'name' you should think 'form_cd')

Sambi, June 10, 2009 - 12:09 pm UTC

Tom,

Sorry for the confusion..

I wanted the "names" column in the output sorted in alphabetical order(ascending).

LOC CHANGE NAMES
_ID _NO
----------------------------------
55 1 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
55 2 HOME
55 3 CONTRACT,DEDUCTIBLE,HOME

This is owned by the ETL team and I Agree with you..it is not a good design.

The "names" column for change_no 2 should contain the comma separated list of distinct active form_cd values from all prior change_nos and the current change_no i.e <=current change_no

A form_cd is active if the most recent value of is_active flag for that form_cd is 'Y' in the change_nos <=current change_no
and it appears in the list.

Case i) For the input like this..

loc_id form_cd change_no is_active
-------------------------------------
55 ADJUSTMENT 1 Y
55 ALARM 1 Y
55 ALARM 1 Y
55 CONTRACT 1 Y
55 DEDUCTIBLE 1 Y
55 HOME 1 Y
55 SAFEGUARD 1 Y
55 ADJUSTMENT 2 Y
55 ALARM 2 Y
55 CONTRACT 2 Y
55 DEDUCTIBLE 2 Y
55 SAFEGUARD 2 Y
55 CONTRACT 3 Y
55 DEDUCTIBLE 3 Y


The output should be

LOC CHANGE NAMES
_ID _NO
----------------------------------------
55 1 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
55 2 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD


Case ii) input

loc_id form_cd change_no is_active
-------------------------------------
55 ADJUSTMENT 1 Y
55 ALARM 1 Y
55 ALARM 1 Y
55 CONTRACT 1 Y
55 DEDUCTIBLE 1 Y
55 HOME 1 Y
55 SAFEGUARD 1 Y
55 DEDUCTIBLE 2 Y
55 SAFEGUARD 2 Y
55 CONTRACT 3 Y
55 DEDUCTIBLE 3 Y


output:

LOC CHANGE NAMES
_ID _NO
----------------------------------------
55 1 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
55 2 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD


Case iii)

For the input like this..

loc_id form_cd change_no is_active
-------------------------------------
55 ADJUSTMENT 1 Y
55 ALARM 1 Y
55 ALARM 1 Y
55 CONTRACT 1 Y
55 DEDUCTIBLE 1 Y
55 HOME 1 Y
55 SAFEGUARD 1 Y
55 ADJUSTMENT 2 N
55 ALARM 2 N
55 CONTRACT 2 N
55 DEDUCTIBLE 2 N
55 SAFEGUARD 2 N
55 CONTRACT 3 Y
55 DEDUCTIBLE 3 Y


The output should be (for change_no 1 & 2)

LOC CHANGE NAMES
_ID _NO
----------------------------------------
55 1 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
55 2 HOME

because in change_no 2 all the other "form_cd"s of pior change_nos except "HOME" are set as inactive using is_active='N'

and change_no 3 the form_cds "CONTRACT" and "DEDUCTIBLE" are made "active" by setting is_active='Y' and so
should contain all the active form_cds of change_nos <=3.

Output (for chagne_no:3 )

LOC_ID CHANGE_NO NAMES
----------------------------------------
55 3 CONTRACT,DEDUCTIBLE,HOME

so the complete output will be

LOC CHANGE NAMES
_ID _NO
----------------------------------------
55 1 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
55 2 HOME
55 3 CONTRACT,DEDUCTIBLE,HOME

Hope this is clear. Thanks in advance.

Tom Kyte
June 10, 2009 - 4:19 pm UTC

ops$tkyte%ORA10GR2> with
  2  -- we need all of the distinct form_cd values, we'll outer join to this
  3  -- set to "fill in the missing values".  Basically, we need to de-sparsify
  4  -- the data, make sure every loc_id/change_no has all of the form_cd values
  5  distinct_form_cd
  6  as
  7  (
  8  select distinct form_cd
  9    from tnx_frm
 10  ),
 11  -- we also need to get rid of duplicates.  I am making the (reasonable)
 12  -- assumption that if in a given loc_id, change_no, form_cd group - there
 13  -- is any row with a value of "Y", it is the "relevant" one - eg: if in a
 14  -- group there is a "Y" and and "N" row, we'll keep the "Y" row (since Y>N)
 15  distinct_tnx_frm
 16  as
 17  (
 18  select loc_id, change_no, form_cd, max(is_active) is_active
 19    from tnx_frm
 20   group by loc_id, change_no, form_cd
 21  ),
 22  -- now we need to fill in the gaps, the partitioned outer join (10g)
 23  -- does that for us.  We break the tnx_frm data up by loc_id,change_no
 24  -- and for every group we get - we outer join to the distinct form_cds
 25  -- now, every loc_id, change_no has EVERY form_cd.  We use LAST_VALUE
 26  -- then to carry down the last observed value of IS_ACTIVE across all
 27  -- of the groups - so if the first change_no in a loc_id has "Y" and the
 28  -- third change_no in a loc_id has "N" and the fifth has "Y" again - then
 29  -- groups 1, 2 = "Y", 3, 4 = "N", and 5 on up will have "Y"
 30  carry_down_is_active
 31  as
 32  (
 33  select distinct_form_cd.form_cd, distinct_tnx_frm.loc_id, distinct_tnx_frm.change_no,
 34         last_value( distinct_tnx_frm.is_active ignore nulls )
 35              over (partition by loc_id,distinct_form_cd.form_cd order by change_no) isact
 36    from distinct_tnx_frm partition by (loc_id,change_no)
 37            right outer join distinct_form_cd
 38              on (distinct_tnx_frm.form_cd = distinct_form_cd.form_cd)
 39  ),
 40  -- Now we just pull off the active records in each loc_id/change_no pair
 41  -- and then assign a row_number after sorting by form_cd
 42  data
 43  as
 44  (
 45  select loc_id, form_cd, change_no,
 46         row_number() over (partition by loc_id, change_no order by form_cd) rn
 47    from carry_down_is_active
 48         where isact = 'Y'
 49  )
 50  -- so we can string aggregate using a connect by trick...
 51  -- ta-dah, we are done
 52  select loc_id, change_no,
 53         substr( max(sys_connect_by_path( form_cd, ',' )), 2 ) form_cds
 54    from data
 55   start with rn = 1
 56  connect by prior rn+1 = rn and prior loc_id = loc_id and prior change_no = change_no
 57   group by loc_id, change_no
 58   order by 1, 2
 59  /

    LOC_ID  CHANGE_NO FORM_CDS
---------- ---------- ----------------------------------------------------------------------
        55          1 ADJUSTMENT,ALARM,CONTRACT,DEDUCTIBLE,HOME,SAFEGUARD
        55          2 HOME
        55          3 CONTRACT,DEDUCTIBLE,HOME



Sambi, June 11, 2009 - 10:45 am UTC

Thanks tom.That is really great..
This is exactly what i wanted..
Thank you once again..

row concatenation into a clob

dxl, June 30, 2009 - 10:41 am UTC

This thread is very useful for concatenating smallish amounts of column data into a single entry in a row... however what technique would you employ if you had to concatenate many thousands of column data??

I am trying to come up with a way of taking a column in a table with many thousands of rows and concatenating the rows together, maybe using an end of line delimiter between rows, and then passing this data into a clob/blob so that a procedure can stream this data back to a jdbc client.

So the basic principle is how to stream the contents of a column in a table back to a client via a packaged procedure (probably using clobs or blobs)??

The only way i can think of doing this is processing a batch of rows at a time and appending them to the clob out parameter of a procedure so that the jdbc client can stream this data. Would a pipelined function help here to start populating the stream?

Any thoughts on the correct technique would be helpful.
Tom Kyte
July 06, 2009 - 6:48 pm UTC

... however what technique would you employ if you
had to concatenate many thousands of column data??
....

I would raise the question seriously "what the heck for - do you really mean to be using XML perhaps in this case, a big comma delimited string of thousands of rows is useless"


... and then passing this data into a clob/blob so
that a procedure can stream this data back to a jdbc client.
....

why??!?!?!? so the jdbc client can parse the string??? why would you not just PASS A RESULT SET BACK TO THEM????

Thanks! Tom - one review more pls

suresh, July 21, 2009 - 8:05 am UTC

Hi Tom,

I am writing a query to retrieve the shift dates from a table in the following manner.

Expected result :-
Month Shift-1 Shift-2
Jan2009 01,02,03 7,9,11,23,34
Feb2009 01,02,03 7,9,11,23,34

The query i am using for same
==================================
SELECT to_char(shift_date,'MONYYYY') ,
LTRIM(MAX( SYS_CONNECT_BY_PATH (to_char(shift_date,'DD'), ',')),',') shift-1
FROM
(SELECT peoplesoft_id,shift_date,shift_codes,
row_number() OVER ( PARTITION BY to_char(shift_date,'MONYYYY') ORDER BY shift_date) rn
FROM shift_plan where peoplesoft_id=1234567 )
CONNECT BY to_char(shift_date,'MONYYYY') = PRIOR to_char(shift_date,'MONYYYY')
AND rn = PRIOR rn+1
START WITH rn =1
GROUP BY to_char(shift_date,'MONYYYY');

**** But the problem here is where to put the condition for each shift type ****
1) If i put the condition in row over partition statement then i get the result for only for that shift i.e shift-1 or something.
2) I have tried to use case statement to filter the shift codes above in ltrim place but the condition itself not working
ex:- ltrim(max(case when shift_codes in ('W1','S1','O1') then LTRIM(MAX( SYS_CONNECT_BY_PATH (to_char(shift_date,'DD'), ',')),',') end

Please help.

-Thanks
Suresh
Tom Kyte
July 24, 2009 - 12:08 pm UTC

no create
no inserts
no example

I don't even bother to read.... no idea if we can answer your question, but unless you make it easy for us to answer it - no one will look at it.

Thanks! one more review pls

Suresh, July 25, 2009 - 7:19 am UTC

Hi Tom,

Apologies for not clear enough.

Here is the table.
Table Shift_Codes
=================
Shift_Code Shift_Descriptio
W GeneralShift
W1 Shift1
W2 Shift2
W3 Shift3
...
and so on upto 30 shiftcodes.


Shift_Plan
============
Psoftid Date ShiftCode
1234567 01-Jan-09 W1
1234567 02-Jan-09 W1
1234567 03-Jan-09 W2
...
1234567 26-Jan-09 H
1234567 01-Feb-09 W1
1234567 02-Feb-09 W2
1234567 06-Feb-09 W3
..

I am generating a report of shift wise dates like below by selecting the psoftid Employee can view his shifts

Month Shift-1 Shift-2 Shift-3 Holidays Weekends
Jan-09 01,02 03 - 26 14,15,21,22
Feb-09 01, 02 06 - 14,15,21,22


For this i have written the below query and this works well upto shift-1
=======================================
SELECT to_char(shift_date,'MONYYYY') ,
LTRIM(MAX( SYS_CONNECT_BY_PATH (to_char(shift_date,'DD'), ',')),',') shift-1
FROM
(SELECT peoplesoft_id,shift_date,shift_codes,
row_number() OVER ( PARTITION BY to_char(shift_date,'MONYYYY') ORDER BY shift_date) rn
FROM shift_plan where peoplesoft_id=1234567 and shift_codes in ('W1')
CONNECT BY to_char(shift_date,'MONYYYY') = PRIOR to_char(shift_date,'MONYYYY')
AND rn = PRIOR rn+1
START WITH rn =1
GROUP BY to_char(shift_date,'MONYYYY');
Output
======
Month Shift-1
Jan-09 01,02
Feb-09 01,

But how to put condition for others columns shift-2,shift-3,Holidays,Weekends in my above query.


I tried to keep the condition like below.

SELECT to_char(shift_date,'MONYYYY') ,
LTRIM(MAX( case when shift_codes in ('W1') then SYS_CONNECT_BY_PATH (to_char(shift_date,'DD'), ','))end as "Shift-1",',') FROM
(SELECT peoplesoft_id,shift_date,shift_codes,
row_number() OVER ( PARTITION BY to_char(shift_date,'MONYYYY') ORDER BY shift_date) rn
FROM shift_plan where peoplesoft_id=1234567 )
CONNECT BY to_char(shift_date,'MONYYYY') = PRIOR to_char(shift_date,'MONYYYY')
AND rn = PRIOR rn+1
START WITH rn =1
GROUP BY to_char(shift_date,'MONYYYY');

But the i got similar output, means case not working
Month Shift-1
Jan-09 01,02,03,...26,14,15,21,22..31
Feb-09 01,02,..06...14,15,21,22...28
....


Hope this time I am clear enough. Please advise.

-Thanks
Suresh



Tom Kyte
July 26, 2009 - 7:48 am UTC

no CREATE
no INSERTS
no look
never
not then
not now
not tomorrow


I honestly have no idea if your question can be answered - but only because I did not read it, and will not read it unless and until it includes a CREATE TABLE statement I CAN RUN, and insert into statements that I CAN RUN.

You know, sort of like I always give you guys...

STRAGG with "distinct"

Joe, September 24, 2009 - 9:15 am UTC

Tom,

I am using the STRAGG function to concatenate a column from multiple rows but I only want the distinct values. So I did a "STRAGG(distinct col1)" and this works great. However, when I put this SQL in my procedure it fails to compile with "PL/SQL: ORA-30482: DISTINCT option not allowed for this function".

Can you tell me why this works when I run the sql by itself and fails whenever I put it in a procedure or BEGIN/END block? Is there a way to fix it? I guess I could first select the distinct values in a sub-query and then use stragg but was wondering if there is another way.

Below is a sample you can run in a command window that show that it works by itself and fails inside a begin/end block.


create table t (col1 varchar2(20));

-- This inserts works here
insert into t
  select stragg(distinct dummy)
    from dual;


-- When inside a procedure or begin/end block, the insert fails
begin
  insert into t
    select stragg(distinct dummy)
      from dual;
end;
/


Thanks,
Joe (Oracle 10g)
Tom Kyte
September 29, 2009 - 8:12 am UTC

I filed a bug on this as it reproduces in 9i-11gr2. It should work

You can either

a) use distinct AND THEN stragg as you said...
b) use dynamic sql, hide the construct from plsql
c) use a view, hide the construct from plsql

Sangeetha, June 17, 2010 - 3:46 am UTC

HI Tom,

I was trying to use stragg in one of my queries. My idea is to get a concatenated list of all the workorders against an aircraft_partno,aircraft_serialno, flight_check_no.

SELECT stragg (workorderno) wo, aircraft_partno,
aircraft_serialno, flight_check_no
FROM wip_details
GROUP BY aircraft_partno, aircraft_serialno, flight_check_no;

this is my query. my STRING variable in stragg_type is of size 32767 .I have around 968 workorders in that table against each combination and it's length is 12 characters. So I am expecting that the stragg should have the capacity to hold 968*12 characters at max.but it is throwing the error "ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at "VISMP.STRAGG_TYPE", line 28
"
I tried to make the variable STRING variable in stragg_type as "CLOB" still I am getting the same error.

Am I missing anything??

I have noted one more thing,

I just rewrote the query to understand, at which length it breaks. I could see that the inner query works OK even with 500 records, but when I execute the whole query it breaks. What could be the problem??


SELECT wo, LENGTH (wo)
FROM (SELECT stragg (workorderno) wo, aircraft_partno,
aircraft_serialno, flight_check_no
FROM wip_details
WHERE source_type IN ('NR', 'JV')
AND (aircraft_partno, aircraft_serialno, flight_check_no) IN (
SELECT aircraft_partno, aircraft_serialno,
flight_check_no
FROM (SELECT COUNT (*) cnt, aircraft_partno,
aircraft_serialno, flight_check_no
FROM wip_details
GROUP BY aircraft_partno,
aircraft_serialno,
flight_check_no
ORDER BY COUNT (*) DESC)
WHERE cnt < 500)
GROUP BY aircraft_partno, aircraft_serialno, flight_check_no)
ORDER BY LENGTH (wo) DESC

I dont know whether this a new question or a clarification.
Could you please answer, when you get time.

Thank you very much for your time and advice
Regards
Sangeetha



Tom Kyte
June 22, 2010 - 12:30 pm UTC

you do not show your work, I don't know what line 28 of your code might be - but you'll be limited to a varchar2(4000) for the interface to SQL. If you are using a clob, you'll want to use a clob in the entire interface - not just at the end which is what I suspect you did.


stragg/pivot/analytics ?

Ravi B, July 22, 2010 - 4:25 pm UTC

Hi Tom,

Could you please help me accomplish this.

I have two tables master-detail
I have a requirement to append detail records as columns to master record. The number of columns depends upon number of values of detail table. I couldn't accomplish this through pivot or stragg.

Example:
Master table emp, detail table emp_support
Each employee could be of level1 or level2 or level3 support; or combination of these support levels.

I wanted an output like this:

empno ename support_level1 support_leve2 support_leve3
-----------------------------------------------------------------
7369 SMITH level1 level2
7499 ALLEN level1 level2 level3
7521 WARD level2 level3
7566 JONES level3
7654 MARTIN level2 level3
7698 BLAKE level1 level2 level3
7782 CLARK level3
7788 SCOTT level2
7839 KING level2
7844 TURNER level1
7876 ADAMS level1 level2 level3
7900 JAMES level1
7902 FORD level1
7934 MILLER level2 level3

create table emp_support (empno number, support_level varchar2(10));

insert into emp_support values(7369,'level1');
insert into emp_support values(7369,'level2');
insert into emp_support values(7499,'level1');
insert into emp_support values(7499,'level2');
insert into emp_support values(7521,'level3');
insert into emp_support values(7566,'level3');
insert into emp_support values(7654,'level2');
insert into emp_support values(7654,'level3');
insert into emp_support values(7698,'level1');
insert into emp_support values(7698,'level2');
insert into emp_support values(7698,'level3');
insert into emp_support values(7782,'level3');
insert into emp_support values(7788,'level2');
insert into emp_support values(7839,'level2');
insert into emp_support values(7844,'level1');
insert into emp_support values(7786,'level1');
insert into emp_support values(7786,'level2');
insert into emp_support values(7786,'level3');
insert into emp_support values(7900,'level1');
insert into emp_support values(7902,'level1');
insert into emp_support values(7934,'level2');
insert into emp_support values(7934,'level3');
commit;

The table emp could have anywhere between 5 mil - 10 mil records.
Tom Kyte
July 23, 2010 - 9:37 am UTC

A query needs to have a fixed number of columns at parse time.

You would need to

a) query to find out the max number of columns (or just set a reasonable default max)

b) then build a query that uses the output of (a) to select out a fixed number of columns.


There is no other way - if you need them as individual columns.

stragg/pivot/analytics ?

Ravi B, July 23, 2010 - 11:09 am UTC

Tom,

What if suppose i set the default to 5. How would I go about writing this query.

Thanks for your help.
Tom Kyte
July 23, 2010 - 12:37 pm UTC

ops$tkyte%ORA10GR2> select d.deptno, d.dname,
  2         max( decode( rn, 1, e.ename ) ) ename1,
  3         max( decode( rn, 2, e.ename ) ) ename2,
  4         max( decode( rn, 3, e.ename ) ) ename3,
  5         max( decode( rn, 4, e.ename ) ) ename4,
  6         max( decode( rn, 5, e.ename ) ) ename5,
  7         max( decode( rn, 6, e.ename ) ) ename6
  8    from scott.dept d,
  9         (select ename, deptno, row_number() over (partition by deptno order by ename) rn
 10                from scott.emp ) e
 11   where e.deptno = d.deptno
 12   group by d.deptno, d.dname
 13   order by d.deptno, d.dname
 14  /

    DEPTNO DNAME          ENAME1     ENAME2     ENAME3     ENAME4     ENAME5     ENAME6
---------- -------------- ---------- ---------- ---------- ---------- ---------- ----------
        10 ACCOUNTING     CLARK      KING       MILLER
        20 RESEARCH       ADAMS      FORD       JONES      SCOTT      SMITH
        30 SALES          ALLEN      BLAKE      JAMES      MARTIN     TURNER     WARD


Pivot - new in 11g - would work as well in 11g.

Thanks!

Ravi B, July 23, 2010 - 2:12 pm UTC

Thank you very much Tom!

" to concatenate many thousands of column "

Jonny, October 07, 2010 - 7:56 am UTC

Hello!
I'm trying to concatenate many thousands of column
data to get md5 hash for a whole (ordered) column.
With ODCI method, im getting "ORA-22813: operand value exceeds system limits".
Could you advice something?
Tom Kyte
October 11, 2010 - 10:15 am UTC


the result of a concatenate of strings will be limited to 4000 character in SQL, 32k in PLSQL outside of SQL. You'd have to use a CLOB type - you would code a procedure to build the clob and then hash it.

You give no example of your current approach so we cannot comment on it - but what you'll need to do is write a stored procedure that "builds" the clob and then hashes it.


What is your goal - what is this hash to be used for - what are you trying to ultimately do (maybe we can suggest a better way, concatenating "thousands" of columns doesn't see to make sense right off the top of it)

"to concatenate many thousands of column"

A reader, March 30, 2011 - 4:32 am UTC

HI

I have exactly the same problem as Johny, I have to create a hash across many fields in a DB, this is to match a similar hash created on the appserver, and of course ordering in the concat string is also important. The reason is simple, comparison of loaded data against transmitted data in the wider system. And of course, this is a large project and the specification of this will not change now!

To implement a solution I have taken an approach very similar to the one further up this string, modifying the aggregation function to output a clob, and internally using a nested table type to store the values as they come in from the iterate. Problem is that I run into exactly the same problem as Johny, ora-22813.

The interesting thing is that the error is generated sometime after the iterate method has run, the table is extended and teh data added, the error occurs after this. Also I have tried adding similar amounts of data outside the aggregate function to the same table type and all is fine, the limit seems to occur only because of the object.

The original code for this was written using the 11g listagg function which is limited to 4k characters. I was looking for a solution which would have a similar syntax for this reason.

Cheers

Bob
Tom Kyte
March 30, 2011 - 8:14 am UTC

I'll never understand this "requirement" unless it comes from the CPU manufacturer.

no code, no example, no look.

give me a create of the code
a create of a table
an insert into that table

to demonstrate the issue.

to concatenate many thousands of column

A reader, March 30, 2011 - 6:00 am UTC

And there's more...

As an experiment I created a little testbed to call the agg object outside of SQL, guess what? It works fine!!!

Called from SQL the maximum length of the final output clob is in the order of 18K, calling it from the test bed it generates a clob of 500K+. It seems that the limitation is someting in the SQL engine, not as I originally thought the object wrapper.

Bob

Test code
declare
o typ_agg;
res number;
c clob;
begin
o:=typ_agg(null, null); --not sure which of these 2 lines is needed, the constructor seems not to call the initialize!
res := typ_agg.ODCIAggregateInitialize(o);
for r in (SELECT a.num || ';' ||a.data d --table just contains a lot of text in data and a number
FROM ag_test a
order by num desc)
loop
res := o.ODCIAggregateIterate(to_char(r.d));
end loop;
res := o.ODCIAggregateTerminate(c,0);
dbms_output.put_line('clob length ' || dbms_lob.getlength(c));
end;
Tom Kyte
March 30, 2011 - 8:15 am UTC

complete examples only - from start to finish, that can be run on a newly installed otherwise empty database.

and again

A reader, March 30, 2011 - 10:03 am UTC

Here you go a contained example - frist select at the end fails, second suceeds...

Cheers

Bob

create table tst_agg as
(Select rownum idx, 'just a string ' || rownum data
from dual
connect by rownum <= 4000)
/
CREATE OR REPLACE
type temp_agg_type as object( idx number(18), line varchar2(10000));
/

CREATE OR REPLACE
type temp_tab is table of temp_agg_type;
/

CREATE OR REPLACE
TYPE typ_agg AS OBJECT
(
tbl temp_tab,

STATIC FUNCTION ODCIAggregateInitialize(P_SCTX IN OUT TYP_AGG)
RETURN NUMBER,

MEMBER FUNCTION ODCIAggregateIterate(SELF IN OUT TYP_AGG,
P_VALUE IN varchar2 )
RETURN NUMBER,

MEMBER FUNCTION ODCIAggregateTerminate(SELF IN TYP_AGG,
P_RETURNVALUE OUT clob,
P_FLAGS IN NUMBER)
RETURN NUMBER,

MEMBER FUNCTION ODCIAggregateMerge(SELF IN OUT TYP_AGG,
P_CTX2 IN TYP_AGG)
RETURN NUMBER
);
/

CREATE OR REPLACE
TYPE BODY typ_agg IS

STATIC FUNCTION ODCIAggregateInitialize(P_SCTX IN OUT TYP_AGG)
RETURN NUMBER IS
tb temp_tab;
BEGIN
tb := temp_tab();
P_SCTX := TYP_AGG(tb);
RETURN ODCIConst.Success;
END;


MEMBER FUNCTION ODCIAggregateIterate(SELF IN OUT TYP_AGG,
P_VALUE IN varchar2 )
RETURN NUMBER IS
BEGIN
tbl.extend(1);
--in our case the first elements in the concat is the ordering number
tbl(tbl.count) := temp_agg_type(to_number(substr(p_value,1, instr(p_value, ';') - 1)), p_value);
RETURN ODCIConst.Success;
END;

MEMBER FUNCTION ODCIAggregateTerminate(SELF IN TYP_AGG,
P_RETURNVALUE OUT clob,
P_FLAGS IN NUMBER)
RETURN NUMBER IS
c clob := null;
tmp varchar2(32000) := null;
BEGIN
for r in (select line
from table(tbl)
order by idx)
loop
if c is null and tmp is null
then
tmp := r.line;
else
if length(tmp || r.line) > 32000
then
c := c || tmp;
tmp := '';
end if;
tmp := tmp || ';'|| r.line;
end if;
end loop;
P_RETURNVALUE := c || tmp;
RETURN ODCIConst.Success;
END;

MEMBER FUNCTION ODCIAggregateMerge(SELF IN OUT TYP_AGG,
P_CTX2 IN TYP_AGG)
RETURN NUMBER IS
BEGIN
--not implemented yet
dbms_output.put_line('Merge ');
RETURN ODCIConst.Success;
END;
END;
/

CREATE OR REPLACE
FUNCTION agg (p_input VARCHAR2)
RETURN clob
--PARALLEL_ENABLE
AGGREGATE USING TYP_AGG;
/
select agg(idx || ';' || data)
from tst_agg
/
select agg(idx || ';' || data)
from tst_agg
where idx <= 1000
/



Tom Kyte
April 12, 2011 - 9:33 am UTC

just got back from an extended trip - thanks for posting the solution you worked out.

It does seem to be a limit on the size of the data in the collection element itself - with your data, once you hit about 1009 elements or so, the agg-iterate routine fails with ora-22813.

Moving to a package works but should be considered fragile. Your init routine must zero out the package variable - meaning you can only use this aggregate routine ONCE in a sql statement and probably NOT in paralle.

solution

A reader, April 06, 2011 - 5:42 am UTC

For anyone else stuck with this problem, put the nested table in a package, not in the aggregate type, this allows it to grow beyond 30K. The problem then is the speed of the clobs...

Bob

Pauline, April 14, 2011 - 12:16 pm UTC

Tom,
It looks like this URL is not available now.
http://asktom.oracle.com/pls/ask/search?p_string=stragg+clob

Coult you tell the accessable URL?

When we want to select distinct value for clob column, we may do something like, for example:

create table t (id number(8), cl clob);
insert into t values(1,'abc');
insert into t values(2,'efg');
commit;

--get result as
CL
--------------------------------------------------------------------------------
abc
efg

But how can I select id which corresponding to each clob row
in big table while select distinct value on clob column?

Thanks

Tom Kyte
April 14, 2011 - 5:29 pm UTC

http://asktom.oracle.com/pls/asktom/asktom.search?p_string=stragg+clob



you cannot distinct clobs, it will not work.

Pauline, April 14, 2011 - 12:20 pm UTC

Sorry, I cut the my SQL to only select distinct value on CLOB which is
select cl
from (select t1.cl,
row_number() over (partition by dbms_lob.compare (t1.cl, t2.cl) order by null) rn
from t t1, t t2
where dbms_lob.compare (t1.cl, t2.cl) != 0)
where rn = 1;

Now I need to select id with distinct value on CLOB.

Thanks
Tom Kyte
April 14, 2011 - 5:30 pm UTC

I do not get this at all. You really want to cartesian join t1 to t2, and then compare every lob in t1 with every lob in t2?

really?

Pauline, April 18, 2011 - 9:28 am UTC

Tom,
Actually it should have the where cause to indicate which id's lob I want to compare.
My real purpose is want to compare SQL_text in 11g Oracle SQL plan baseline and get the signature for them. Here is my
sql
set pagesize 0 long 30000 head off
select sql_text
from (select t1.sql_text,
row_number() over (partition by dbms_lob.compare (t1.sql_text, t2.sql_text) order by null) rn
from DBA_SQL_PLAN_BASELINES t1, DBA_SQL_PLAN_BASELINES t2
where dbms_lob.compare (t1.sql_text, t2.sql_text) != 0
and t1.SQL_HANDLE ='SQL_e711aaab2ce5195e')
where rn = 1;

It returns 2 rows to me. How can I use query to select distinct SIGNATURE and sql_text by t1.SQL_HANDLE ?

Thanks.



Tom Kyte
April 18, 2011 - 10:46 am UTC

signature is defined as "Unique SQL identifier generated from normalized SQL text"

I'm still confused why you are self joining?


and why you are comparing sql_text? and not signature?

just generate the set of sql_handles/signatures you want - and use that in a subquery as an IN clause to get the data you want from that table.

Pauline, April 19, 2011 - 9:32 am UTC

Tom,
Let me go back to make my question simpler ( forget about query sql plan baseline) - how to query distict CLOB with ID
( I mean dispaly PK / ID and clob text ) by some condition in where clause? Can you give some example or links?

Thanks
Tom Kyte
April 19, 2011 - 6:55 pm UTC

There is no way to "distinct" a clob, you'd need to use something like a md5 checksum (see dbms_crypto) on the clob to get something "distinctable"

question about row_number()

Ian, May 18, 2011 - 4:24 pm UTC

Hi Tom,

Thanks for the awesome site. I need help producing a query:


create table t (id number, name varchar2(30), change_date date, colour varchar2(30) );

insert into t values (1,'ben', to_date('2011-01-01', 'yyyy-mm-dd'), 'blue');
insert into t values (2,'ben', to_date('2011-01-15', 'yyyy-mm-dd'), 'blue');
insert into t values (3,'ben', to_date('2011-02-01', 'yyyy-mm-dd'), 'blue');
insert into t values (4,'ben', to_date('2011-02-10', 'yyyy-mm-dd'), 'red');
insert into t values (5,'ben', to_date('2011-03-01', 'yyyy-mm-dd'), 'blue');
insert into t values (6,'ben', to_date('2011-04-01', 'yyyy-mm-dd'), 'green');
insert into t values (7,'ben', to_date('2011-05-01', 'yyyy-mm-dd'), 'green');
insert into t values (8,'ben', to_date('2011-05-10', 'yyyy-mm-dd'), 'black');
insert into t values (9,'ben', to_date('2011-06-01', 'yyyy-mm-dd'), 'blue');
insert into t values (10,'jerry', to_date('2011-01-10', 'yyyy-mm-dd'), 'blue');
insert into t values (11,'jerry', to_date('2011-01-31', 'yyyy-mm-dd'), 'blue');
insert into t values (12,'jerry', to_date('2011-02-10', 'yyyy-mm-dd'), 'blue');
insert into t values (13,'jerry', to_date('2011-02-28', 'yyyy-mm-dd'), 'blue');
insert into t values (14,'jerry', to_date('2011-03-10', 'yyyy-mm-dd'), 'blue');
insert into t values (15,'jerry', to_date('2011-04-10', 'yyyy-mm-dd'), 'blue');

select * from t
order by name, change_date;


ID NAME CHANGE_DATE COLOUR
1 ben 1/1/2011 blue
2 ben 1/15/2011 blue
3 ben 2/1/2011 blue
4 ben 2/10/2011 red
5 ben 3/1/2011 blue
6 ben 4/1/2011 green
7 ben 5/1/2011 green
8 ben 5/10/2011 black
9 ben 6/1/2011 blue
10 jerry 1/10/2011 blue
11 jerry 1/31/2011 blue
12 jerry 2/10/2011 blue
13 jerry 2/28/2011 blue
14 jerry 3/10/2011 blue
15 jerry 4/10/2011 blue


I want to select rows for each person, for each change in colour. So the result set would be:


ID NAME CHANGE_DATE COLOUR
1 ben 1/1/2011 blue
4 ben 2/10/2011 red
5 ben 3/1/2011 blue
6 ben 4/1/2011 green
8 ben 5/10/2011 black
9 ben 6/1/2011 blue
10 jerry 1/10/2011 blue


I'm struggling with specifying the appropriate window to partition by. I need to preserve the date order and THEN partition by name and colour.

select *
from
(select t.*
,row_number() over (partition by name, colour order by change_date) row_num
from t)
where row_num = 1
order by name, change_date

Thanks Tom!

Tom Kyte
May 18, 2011 - 11:48 pm UTC

ops$tkyte%ORA11GR2> select *
  2    from (
  3  select id, name, change_date, colour,
  4         lag(colour) over (partition by name order by change_date) last_colour
  5    from t
  6         )
  7   where decode( colour, last_colour, 1, 0 ) = 0
  8   order by name, change_date
  9  /

        ID NAME  CHANGE_DA COLOUR   LAST_COL
---------- ----- --------- -------- --------
         1 ben   01-JAN-11 blue
         4 ben   10-FEB-11 red      blue
         5 ben   01-MAR-11 blue     red
         6 ben   01-APR-11 green    blue
         8 ben   10-MAY-11 black    green
         9 ben   01-JUN-11 blue     black
        10 jerry 10-JAN-11 blue

7 rows selected.

Simple when you know how...

Ian, May 19, 2011 - 9:28 am UTC

Thanks

rows to columns

parag, July 06, 2011 - 7:08 am UTC

Dear Tom,


We have a table which structure is as below;

drop table t;


create table t
(
a varchar2(2),
b timestamp not null,
c varchar2(15),
d number(15)
)
/


insert into t values ( 'A1', systimestamp, 'C1', 0);
insert into t values ( 'A1', systimestamp, 'C1', 0);
insert into t values ( 'A1', systimestamp, 'C1', 1);
insert into t values ( 'A1', systimestamp, 'C1', 1);

insert into t values ( 'A2', systimestamp, 'C2', 0);
insert into t values ( 'A2', systimestamp, 'C2', 0);
insert into t values ( 'A2', systimestamp, 'C2', 1);
insert into t values ( 'A2', systimestamp, 'C2', 1);

insert into t values ( 'A3', systimestamp, 'C3', 0);
insert into t values ( 'A3', systimestamp, 'C3', 0);
insert into t values ( 'A3', systimestamp, 'C3', 1);
insert into t values ( 'A3', systimestamp, 'C3', 1);

insert into t values ( 'A4', systimestamp, 'C4', 0);
insert into t values ( 'A4', systimestamp, 'C4', 0);
insert into t values ( 'A4', systimestamp, 'C4', 1);
insert into t values ( 'A4', systimestamp, 'C4', 1);

insert into t values ( 'A5', systimestamp, 'C5', 0);
insert into t values ( 'A5', systimestamp, 'C5', 0);
insert into t values ( 'A5', systimestamp, 'C5', 1);
insert into t values ( 'A5', systimestamp, 'C5', 1);


select * from t;


This table has multiple records as below;

R:IVI6WDD2 > select * from t;

A B C D
-- --------------------------------------------------------------------------- --------------- ----------
A1 06-JUL-11 04.21.07.423326 PM C1 0
A1 06-JUL-11 04.21.07.426255 PM C1 0
A1 06-JUL-11 04.21.07.427802 PM C1 1
A1 06-JUL-11 04.21.07.428753 PM C1 1

A2 06-JUL-11 04.21.07.430097 PM C2 0
A2 06-JUL-11 04.21.07.431042 PM C2 0
A2 06-JUL-11 04.21.07.432413 PM C2 1
A2 06-JUL-11 04.21.07.433357 PM C2 1

A3 06-JUL-11 04.21.07.434883 PM C3 0
A3 06-JUL-11 04.21.07.435829 PM C3 0
A3 06-JUL-11 04.21.07.437157 PM C3 1
A3 06-JUL-11 04.21.07.438084 PM C3 1

A4 06-JUL-11 04.21.07.439397 PM C4 0
A4 06-JUL-11 04.21.07.440471 PM C4 0
A4 06-JUL-11 04.21.07.441916 PM C4 1
A4 06-JUL-11 04.21.07.442882 PM C4 1

A5 06-JUL-11 04.21.07.444288 PM C5 0
A5 06-JUL-11 04.21.07.445270 PM C5 0
A5 06-JUL-11 04.21.07.446619 PM C5 1
A5 06-JUL-11 04.21.07.447550 PM C5 1

20 rows selected.


The question is, we need output in the foll. format ( all data in a single row ) eg.


# Columns output as below;

Columns: C D B C D B C D B C D

Output: C1, 1, 06-JUL-11 04.21.07.427802 PM, C2, 1, 06-JUL-11 04.21.07.438084 PM, C3, 1, 06-JUL-11 04.21.07.442882 PM, C4, 1,

B C D B
06-JUL-11 04.21.07.446619 PM, C5, 1, 06-JUL-11 04.21.07.447550 PM


Condition for where clause is

value of column b i.e. timestamp will be the latest where d=1



Please suggest us as to how we can achieve this in both 10g and 11g

Thanks.



Tom Kyte
July 06, 2011 - 8:15 am UTC

you would have to

a) count how many columns you'll ultimate have - eg: select count(*) * 3 from table where d = 1.

b) dynamically construct a query that uses that information, the query would look something like:

select max( decode( r, 1, c ) ), 
       max( decode( r, 1, d ) ), 
       max( decode( r, 1, b ) ),
       max( decode( r, 2, c ) ), 
       max( decode( r, 2, d ) ), 
       max( decode( r, 2, b ) ),
       ... for all other r values ...
       max( decode( r, N, c ) ), 
       max( decode( r, N, d ) ), 
       max( decode( r, N, b ) )
from (select c,d,b,row_number() over (order by whatever...) r from t where d=1)


we need to know how many columns something will have at PARSE time, you'll have to figure out that number and then build your query dynamically based on that




OK

Kumar, September 15, 2011 - 6:34 am UTC

Hi Tom,
Any simple way to convert the column ENAME values
ALLEN
BLAKE
JAMES
MARTIN
TURNER
to
ALLEN,BLAKE,JAMES,MARTIN,TURNER?

Thanks for your time.
Tom Kyte
September 15, 2011 - 7:55 am UTC

ops$tkyte%ORA11GR2> select listagg( ename, ',' ) within group (order by ename) from scott.emp;

LISTAGG(ENAME,',')WITHINGROUP(ORDERBYENAME)
-------------------------------------------------------------------------------
ADAMS,ALLEN,BLAKE,CLARK,FORD,JAMES,JONES,KING,MARTIN,MILLER,SCOTT,SMITH,TURNER,
WARD



in 11g


stragg in 10g and 9i works.

strange results of STRAGG in 11g

Klaus Eckhardt, November 15, 2011 - 5:06 am UTC

Hi Tom,

we adapted the T_String_AGG a litle bit to order the concated strings. our t_sting_agg is now

CREATE OR REPLACE TYPE BODY SYS.t_string_agg IS

STATIC FUNCTION ODCIAggregateInitialize(sctx IN OUT t_string_agg) RETURN NUMBER IS
BEGIN
sctx := t_string_agg(NULL);
RETURN ODCIConst.Success;
END;

MEMBER FUNCTION ODCIAggregateIterate(SELF IN OUT t_string_agg
,VALUE IN VARCHAR2) RETURN NUMBER IS
/* Ordered sum implemented at 10.07.2009 */
v_arglist arg_list := arg_list();
BEGIN
SELF.g_string := SELF.g_string || '/' || VALUE;
BEGIN
SELECT to_number(column_value) BULK COLLECT INTO v_arglist FROM TABLE(Varchar2ArgList(SELF.g_string, '/')) ORDER BY 1;
EXCEPTION
WHEN OTHERS THEN
SELECT * BULK COLLECT INTO v_arglist FROM TABLE(Varchar2ArgList(SELF.g_string, '/')) ORDER BY 1;
END;
SELF.g_string := arglist2varchar2(v_arglist, '/');
RETURN ODCIConst.Success;
END;

MEMBER FUNCTION ODCIAggregateTerminate(SELF IN t_string_agg
,returnValue OUT VARCHAR2
,flags IN NUMBER) RETURN NUMBER IS
BEGIN
returnValue := RTRIM(LTRIM(SELF.g_string, '/'), '/');
RETURN ODCIConst.Success;
END;

MEMBER FUNCTION ODCIAggregateMerge(SELF IN OUT t_string_agg
,ctx2 IN t_string_agg) RETURN NUMBER IS
BEGIN
SELF.g_string := SELF.g_string || '/' || ctx2.g_string;
RETURN ODCIConst.Success;
END;
END;

where as
CREATE OR REPLACE FUNCTION Varchar2ArgList(pv_string IN VARCHAR2
,p_Separator IN VARCHAR2) RETURN arg_list IS
vl_result arg_list := arg_list();
v_string VARCHAR2(32767) := pv_string;
v_Separator VARCHAR2(100) := nvl(p_Separator, chr(30));
BEGIN
WHILE instr(v_string, v_Separator, 1, 1) > 0 LOOP
vl_result.EXTEND;
vl_result(vl_result.LAST) := substr(v_string, 0, instr(v_string, v_Separator, 1, 1) - 1);
v_string := substr(v_string, instr(v_string, v_Separator, 1, 1) + 1, length(v_string) - instr(v_string, v_Separator, 1, 1));
END LOOP;
vl_result.EXTEND;
vl_result(vl_result.LAST) := v_string;
RETURN vl_result;
END Varchar2ArgList;

the STRAGG function worked fine in 10g but in 11gR2 we get strange results:

e.g. query in 10g
SELECT (stragg(DISTINCT x) OVER(PARTITION BY 1)
FROM (SELECT 'R' x FROM DUAL UNION ALL
SELECT 'R' x FROM DUAL UNION ALL
SELECT 'F' x FROM DUAL UNION ALL
SELECT 'R' x FROM DUAL UNION ALL
SELECT 'F' x FROM DUAL);
result is:
F/R
F/R
F/R
F/R
F/R
when you perform the same query in 11g the result is
-
-
-
-
-
the result of the unsupported/undocumented function wm_concat is the same (in 11g) (5 rows with -)
the returned value (-) depends on the strings to be concatenated.

can you please provide a fix for the t_string_agg object?

thanks a lot and best regards, klaus.
Tom Kyte
November 15, 2011 - 8:54 am UTC

ops$tkyte%ORA11GR2> edit
Wrote file afiedt.buf

  1  SELECT stragg(DISTINCT x) OVER(PARTITION BY 1)
  2    FROM (SELECT 'R' x FROM DUAL UNION ALL
  3          SELECT 'R' x FROM DUAL UNION ALL
  4          SELECT 'F' x FROM DUAL UNION ALL
  5          SELECT 'R' x FROM DUAL UNION ALL
  6*         SELECT 'F' x FROM DUAL)
ops$tkyte%ORA11GR2> /

STRAGG(DISTINCTX)OVER(PARTITIONBY1)
-------------------------------------------------------------------------------
F,R
F,R
F,R
F,R
F,R



my str_agg works dandy?

partition by 1 should just be "this space left intentionally blank" - that is just like "partition by 'hello world'"

what the heck is up with this:

   EXCEPTION
      WHEN OTHERS THEN
        SELECT * BULK COLLECT INTO v_arglist FROM 
TABLE(Varchar2ArgList(SELF.g_string, '/')) ORDER BY 1;
    END;



that is a bug in your code for sure, no way you can justify that snippet of code ever.

You'll need to do some debugging here - get the code down to just the relevant bits.


CREATE OR REPLACE TYPE BODY SYS.t_string_agg IS


sys? really? SYS??????

please don't do that - sys is ours, leave it be, do not ever put your stuff in there.

still got strange results

Klaus Eckhardt, November 15, 2011 - 11:28 am UTC

hi tom,
many thanks for your quick response.

part

SELECT to_number(column_value) BULK COLLECT INTO v_arglist FROM
TABLE(Varchar2ArgList(SELF.g_string, '/')) ORDER BY 1;
EXCEPTION
WHEN OTHERS THEN
SELECT * BULK COLLECT INTO v_arglist FROM TABLE(Varchar2ArgList(SELF.g_string, '/')) ORDER
BY 1;

from my type is: if the value cannot be casted to a number (for sorting) than use the varchar2-value. this worked fine in 10g

to
CREATE OR REPLACE TYPE BODY SYS.t_string_agg IS
sys? really? SYS??????
i wouldn't do this anymore (in german we would say 'Asche auf mein Haupt' don't know how to say it in english ~ throw ash on my head), but at the moment it's in the database.

in the meantime i tried the version of string_agg_type you posted at Followup March 18, 2004 - 10am Central time zone: but still get same results (rows with -)

when i don't use the DISTINCT the aggregation of strings works fine. i also put a DBMS_OUTPUT.PUT_LINE into
MEMBER FUNCTION ODCIAggregateIterate(SELF IN OUT string_agg_type
,VALUE IN VARCHAR2) RETURN NUMBER IS
BEGIN
data.extend;
data(data.count) := VALUE;
dbms_output.put_line(VALUE);
RETURN ODCIConst.Success;
END;

with distinct i get 2 lines back one with a - and one with a d
without DISTINCT i get 5 lines with Rs and Fs
it seems as if there is some casting going on if DISTINCT is used?

at the moment i'm a little bit desperate about this (next week we should bring the 10g database to 11gR2). could you please post the version of t_string_agg you used for your tests?

many thanks and best regards, klaus.
Tom Kyte
November 16, 2011 - 9:16 am UTC

from my type is: if the value cannot be casted to a number (for sorting) than
use the varchar2-value. this worked fine in 10g


you should be catching the very very specific error "cannot convert", not WHEN OTHERS. That is my point. when others should hardly ever be used, and definitely not here!

NLS_LANGUAGE is the cause why my stringagg is not working

Klaus Eckhardt, November 16, 2011 - 6:50 am UTC

hi tom,
i raised an SR today because of the issue. during the tests for solving the sr i found out that the cause for the strange results is the NLS_LANGUAGE.
after ALTER SESSION SET NLS_LANGUAGE='AMERICAN';

the query
SELECT StrAgg(DISTINCT x) OVER(PARTITION BY 1)
FROM (SELECT 'R' x FROM DUAL UNION ALL
SELECT 'R' x FROM DUAL UNION ALL
SELECT 'F' x FROM DUAL UNION ALL
SELECT 'R' x FROM DUAL UNION ALL
SELECT 'F' x FROM DUAL);
);
delivers the awaited result
F,R
F,R
F,R
F,R
F,R
when i set the session language back to german the result again is
- ☻ ,d ☻
- ☻ ,d ☻
- ☻ ,d ☻
- ☻ ,d ☻
- ☻ ,d ☻
(in sqlplus in a dos box the result is displayed a little bit different to just -)

so, many thanks to you and there is no need to post the version of stragg you used for testing my issue.

best regards, klaus.

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.