I have some situations where I have a report built in Apex with a huge query in the application. Ideally we'd like to keep as much code as possible in the database (for dependency checking purposes etc.), but these queries have been left in Apex because they are parameterised and, for whatever reason, the predicates can't be pushed into views.
So one thing I've been considering is a package with a set of pipelined table functions for these: the function takes the parameters, runs the query, and pipes the rows out.
The main concern I have, however, is whether there is a performance risk in doing this - i.e. is one row at a time really processed, or does the database bulk it up internally?
My initial tests suggest there are some differences, but I'm not 100% sure how to interpret the results - i.e. whether it's just an artifact of the method used or a genuine performance decrease.
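For reference, the packaged approach I'm considering would look something like this sketch (package, function, and parameter names are hypothetical; the table matches the test case below):

```sql
create or replace package report_qry as
  type rpt_row is record (j number);
  type rpt_tab is table of rpt_row;
  -- parameterised report query exposed as a pipelined table function
  function report_rows(p_i in number) return rpt_tab pipelined;
end report_qry;
/
create or replace package body report_qry as
  function report_rows(p_i in number) return rpt_tab pipelined is
    l_row rpt_row;
  begin
    for r in (select j from xxtst where i = p_i) loop
      l_row.j := r.j;
      pipe row (l_row);
    end loop;
    return;
  end report_rows;
end report_qry;
/
```

which Apex (or anything else) could then query as `select * from table(report_qry.report_rows(p_i => 2))`.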
create table xxtst as select mod(level,5) i, level j from dual connect by level <= 10000;
exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'xxtst');
create type xxtst_tab as table of number;
/
create or replace function xxtst_p(pi in number) return xxtst_tab pipelined is
begin
for i in (select j from xxtst where i=pi) loop
pipe row(i.j);
end loop;
end;
/
set autotrace traceonly;
select j from xxtst where i=2;
select * from table(xxtst_p(pi=>2));
drop function xxtst_p;
drop type xxtst_tab;
drop table xxtst;
exit;
$> sp @tst.sql
SQL*Plus: Release 10.1.0.5.0 - Production on Wed Aug 30 12:01:07 2017
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Table created.
PL/SQL procedure successfully completed.
Type created.
Function created.
2000 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=25 Card=2000 Bytes=14000)
1 0 TABLE ACCESS (FULL) OF 'XXTST' (TABLE) (Cost=25 Card=2000 Bytes=14000)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
162 consistent gets
0 physical reads
0 redo size
37017 bytes sent via SQL*Net to client
2182 bytes received via SQL*Net from client
144 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2000 rows processed
2000 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=104 Card=8168 Bytes=16336)
1 0 COLLECTION ITERATOR (PICKLER FETCH) OF 'XXTST_P' (PROCEDURE) (Cost=104 Card=8168 Bytes=16336)
Statistics
----------------------------------------------------------
53 recursive calls
6 db block gets
117 consistent gets
0 physical reads
0 redo size
37028 bytes sent via SQL*Net to client
2182 bytes received via SQL*Net from client
144 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2000 rows processed
Function dropped.
Type dropped.
Table dropped.
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Thanks for the nice test case.
Yeah, I'm not sure we can totally believe the autotrace results here, so I'll use a SQL trace instead, which lets us dig into the recursive calls.
SQL> create table xxtst as select mod(level,5) i, level j from dual connect by level <= 100000;
Table created.
SQL>
SQL> exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'xxtst');
PL/SQL procedure successfully completed.
SQL>
SQL> create or replace type xxtst_tab as table of number;
2 /
Type created.
SQL>
SQL> create or replace function xxtst_p(pi in number) return xxtst_tab pipelined is
2 begin
3 for i in (select j from xxtst where i=pi) loop
4 pipe row(i.j);
5 end loop;
6 end;
7 /
Function created.
SQL> set arraysize 100
SQL> set feedback only
SQL> exec dbms_monitor.session_trace_enable
PL/SQL procedure successfully completed.
SQL> select j from xxtst where i=2;
20000 rows selected.
SQL> select * from table(xxtst_p(pi=>2));
20000 rows selected.
SQL> exec dbms_monitor.session_trace_disable
PL/SQL procedure successfully completed.
SQL> set feedback on
SQL> @tk
VALUE
-------------------------------------------------------------------------------------------------
VALUE1
-------------------------------------------------------------------------------------------------
C:\ORACLE\diag\rdbms\db122\db122\trace\db122_ora_4492.trc
C:\ORACLE\diag\rdbms\db122\db122\trace\db122_ora_4492.prf
1 row selected.
TKPROF: Release 12.2.0.1.0 - Development on Wed Sep 6 08:52:45 2017
Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.
SQL>
SQL>
SQL>
In the trace file I see this:
Raw SQL
=======
select j
from
xxtst where i=2
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 2 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 201 0.00 0.00 0 388 4 20000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 203 0.00 0.00 0 388 6 20000
Pipelined SQL
=============
SELECT J
FROM
XXTST WHERE I=:B1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 201 0.01 0.00 0 388 4 20000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 203 0.01 0.00 0 388 4 20000
select *
from
table(xxtst_p(pi=>2))
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 45 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 201 0.01 0.02 0 0 0 20000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 203 0.03 0.02 0 45 0 20000
So, as expected, in a case where you are doing *exactly* the same work - fetching from a table - adding a pipelined function on top of it does create some overhead. But you can also see that all the normal goodness still applies: we are doing array fetching both inside the function and from the function itself (201 fetch calls for 20000 rows, consistent with the arraysize of 100).
But of course the real benefits of pipelined functions come into play when the functionality is more complicated, ie
- fetch (or be passed) some source rows
- apply additional logic to transform the data
- pipe out the computed results
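A minimal sketch of that pattern (function name and the doubling "transformation" are hypothetical; the table and collection type are from the test case above), using an explicit BULK COLLECT with a LIMIT so the fetching inside the function is done in controlled batches:

```sql
create or replace function transformed_rows(p_filter in number)
  return xxtst_tab pipelined
is
  type src_t is table of xxtst.j%type;
  l_rows src_t;
  cursor c is select j from xxtst where i = p_filter;
begin
  open c;
  loop
    -- fetch up to 100 source rows at a time rather than row by row
    fetch c bulk collect into l_rows limit 100;
    for idx in 1 .. l_rows.count loop
      -- additional transformation logic goes here (doubling as a stand-in)
      pipe row (l_rows(idx) * 2);
    end loop;
    exit when c%notfound;
  end loop;
  close c;
  return;
end;
/
```

Note that a plain cursor FOR loop is also array-fetched automatically (as the trace above shows), so the explicit BULK COLLECT is mainly useful when you want control over the batch size or need to do set-based work on each batch.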