Question and Answer

Tom Kyte

Thanks for the question, Winston.

Asked: September 07, 2004 - 5:15 pm UTC

Last updated: June 24, 2005 - 6:44 pm UTC

Version: 9iR2

Viewed 1000+ times

You Asked

Recently I used HTML DB to develop a function that allows people to upload trace files and then view the analyzed results online. I tried two approaches:

1) upload a file to a BLOB, then use BLOB2_TO_VARCHAR2 to read the BLOB content back as a table of varchar2, then do the parsing/statistics and insert the results into a table.

2) upload a file to a BLOB, then ftp the blob to /tmp/, then create an external table based on the file, then do the parsing/statistics on the external table and insert the results into a table.

Approach 1) is not as reliable as approach 2): I found that different chunk sizes can produce different outcomes when converting the BLOB to a table of varchar2, and I need to handle different line breaks for Unix and Windows files. However, approach 1) lets me use bind variables easily. Approach 2) is harder to automate, and so far I have only done it manually.

Tom, which approach would you recommend, or would you suggest another one?

Thanks in advance,
Winston

and Tom said...

You could consider option "3", which is sort of a simple merge of 1 and 2....

(You can rely on the fact that the newline will be 13/10 or just 10 -- I look for 0A, i.e. 10, below. You could just rtrim( ..., chr(13) ) in the pipe row to allow for Windows files -- since chr(13) shouldn't be there anyway...)


This little utility lets you query any single blob as if it were a table with "rows"



ops$tkyte@ORA9IR2> create or replace type array as table of varchar2(4000)
  2  /

Type created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace function query_bfile( p_cursor in sys_refcursor ) return array
  2  PIPELINED
  3  as
  4      l_bfile   blob;
  5      l_last    number;
  6      l_current number;
  7  begin
  8      fetch p_cursor into l_bfile;
  9      if ( p_cursor%notfound )
 10      then
 11          close p_cursor;
 12          raise NO_DATA_FOUND;
 13      end if;
 14      close p_cursor;
 15
 16      l_last := 1;
 17      loop
 18          l_current := dbms_lob.instr( l_bfile, '0A', l_last, 1 );
 19          exit when (nvl(l_current,0) = 0);
 20
 21          pipe row
 22          ( utl_raw.cast_to_varchar2(
 23                dbms_lob.substr( l_bfile, l_current-l_last+1,
 24                                 l_last ) )
 25          );
 26          l_last := l_current+1;
 27      end loop;
 28      return;
 29  end;
 30  /

Function created.

ops$tkyte@ORA9IR2> variable n number
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec :n := 2;

PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select rownum, column_value
  2    from table( query_bfile( cursor( select theBlob from demo where id = :n ) ) )
  3   where rownum <= 20;

ROWNUM COLUMN_VALUE
---------- --------------------------------------------------------------------------------
1 /home/ora9ir2/admin/ora9ir2/udump/ora9ir2_ora_2373.trc
2 Oracle9i Enterprise Edition Release 9.2.0.5.0 - Production
3 With the Partitioning, OLAP and Oracle Data Mining options
4 JServer Release 9.2.0.5.0 - Production
5 ORACLE_HOME = /home/ora9ir2
6 System name: Linux
7 Node name: xtkyte-pc.us.oracle.com
8 Release: 2.4.21-4.ELsmp
9 Version: #1 SMP Fri Oct 3 17:52:56 EDT 2003
10 Machine: i686
11 Instance name: ora9ir2
12 Redo thread mounted by this instance: 1
13 Oracle process number: 16
14 Unix process pid: 2373, image: oracle@xtkyte-pc.us.oracle.com (TNS V1-V3)
15
16 *** 2004-09-08 07:26:19.819
17 *** SESSION ID:(13.377) 2004-09-08 07:26:19.818
18 APPNAME mod='SQL*Plus' mh=3669949024 act='' ah=4029777240
19 =====================
20 PARSING IN CURSOR #1 len=69 dep=0 uid=48 oct=42 lid=48 tim=1068987089667331 hv=2
004533713 ad='5da0d70c'


20 rows selected.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec :n := 5;

PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select rownum, column_value
  2    from table( query_bfile( cursor( select theBlob from demo where id = :n ) ) )
  3   where rownum <= 20;

ROWNUM COLUMN_VALUE
---------- --------------------------------------------------------------------------------
1 /home/ora9ir2/admin/ora9ir2/udump/ora9ir2_ora_2528.trc
2 Oracle9i Enterprise Edition Release 9.2.0.5.0 - Production
3 With the Partitioning, OLAP and Oracle Data Mining options
4 JServer Release 9.2.0.5.0 - Production
5 ORACLE_HOME = /home/ora9ir2
6 System name: Linux
7 Node name: xtkyte-pc.us.oracle.com
8 Release: 2.4.21-4.ELsmp
9 Version: #1 SMP Fri Oct 3 17:52:56 EDT 2003
10 Machine: i686
11 Instance name: ora9ir2
12 Redo thread mounted by this instance: 1
13 Oracle process number: 16
14 Unix process pid: 2528, image: oracle@xtkyte-pc.us.oracle.com (TNS V1-V3)
15
16 *** 2004-09-08 07:52:01.631
17 *** SESSION ID:(14.222) 2004-09-08 07:52:01.631
18 APPNAME mod='SQL*Plus' mh=3669949024 act='' ah=4029777240
19 =====================
20 PARSING IN CURSOR #1 len=69 dep=0 uid=48 oct=42 lid=48 tim=1068988595343801 hv=2
004533713 ad='5da0d70c'


20 rows selected.
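As noted at the top, a file created on Windows will end each line with chr(13)||chr(10), so the function above would pipe rows carrying a trailing chr(13). A minimal sketch of the rtrim variation (only the pipe row changes; putting both chr(13) and chr(10) in the trim set also strips the line terminator itself from each piped row):

        pipe row
        ( rtrim( utl_raw.cast_to_varchar2(
                     dbms_lob.substr( l_bfile, l_current-l_last+1,
                                      l_last ) ),
                 chr(13) || chr(10) )
        );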



Comments

could we have used a "clob" here?

A reader, September 08, 2004 - 4:56 pm UTC

and avoided some back and forth conversion?

Tom Kyte
September 08, 2004 - 7:36 pm UTC

utl_raw.cast_to_varchar2 is virtually a "no-op" -- all it does is update the datatype byte in the variable. it doesn't really do anything.

http://docs.oracle.com/docs/cd/B10464_05/web.904/b10357/concept.htm#1005883

the file upload facility uses a long raw or blob -- those are the only choices.
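for illustration -- a quick sketch, assuming an ASCII-based database character set (hex 41/42/43 are the bytes for 'ABC'):

select utl_raw.cast_to_varchar2( hextoraw('414243') ) txt from dual;

TXT
---
ABC

the bytes are untouched; only the type tag changes.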

ok - thanx!!

A reader, September 08, 2004 - 7:47 pm UTC


Thanks a lot!

Winston, September 08, 2004 - 11:25 pm UTC

query_bfile works well!

why are they so different in performance?

Winston, September 14, 2004 - 6:27 pm UTC

External file size: 131033112

ls -ls sql*
128096 -rw-r--r-- 1 oracle oracle 131033112 Sep 10 17:22 sqltrace_qa1.txt

CLOB length:

select dbms_lob.getlength(theclob) from demo where id=120;

DBMS_LOB.GETLENGTH(THECLOB)
---------------------------
131033112

insert /*+ append */ into trace_result
select 137322304712000 file_id,
       max(theop) myop,
       max(text) thetext,
       max(exec_time) exec_time,
       max(total_sort_time) total_sort_time
  from (
    select rn, max(mark) over (order by rn) mark, theop,
           text, exec_time, stop_time,
           user_cpu, sorts_cnt, total_sort_time
      from (
        select rownum rn,
               COLUMN_VALUE line,
               case when (COLUMN_VALUE like ' Operation:%' or COLUMN_VALUE like 'Operation:%') then rownum else null end mark,
               case when COLUMN_VALUE like ' Operation: %' then substr(COLUMN_VALUE,13) else null end theop,
               case when COLUMN_VALUE like ' Text : %' then substr(COLUMN_VALUE,12) else null end text,
               case when COLUMN_VALUE like ' Exec Time: %' then replace( substr(COLUMN_VALUE,13),'seconds','') else null end exec_time,
               case when COLUMN_VALUE like ' Total sort time %' then substr(COLUMN_VALUE,18) else null end total_sort_time
          from (
            select log_line COLUMN_VALUE from <external table.sqltrace_qa1>
          )
      )
  )
 where mark is not null
 group by mark;

This took 331 seconds, while using something like query_bfile in place of the external table took 8,104 seconds.

Why is the external table 24 times faster than converting the CLOB into a table of varchar2?

I noticed that with the CLOB approach most waits are "direct path read (lob)".

Is there any way to improve the CLOB approach?

Thanks a lot,
Winston

Tom Kyte
September 14, 2004 - 7:36 pm UTC

external tables are written in C and basically just parse data.

query_bfile isn't.


you could probably improve the blob approach by reading 32k into memory -- processing as much of that as you can (there will be a broken last line) and then appending the next 32k - length(remaining_text) bytes to it (fewer function calls)

Sort on Clob Column

reader from NY, October 15, 2004 - 4:59 pm UTC

Tom,

How do I sort on a clob column ?

Thanks!!!

Tom Kyte
October 15, 2004 - 6:07 pm UTC

tell us what "sort on a clob column" means to you.

do you *really* want to sort based on an up-to-4-gig piece of data?

optimized as suggested

Alberto Dell'Era, October 17, 2004 - 7:44 am UTC

>you could probably improve the bfile by reading 32k into memory--processing as
>much of that as you can (there will be a broken last line) and adding
>32k-length(remaining_text) to it (less function calls)

I've implemented your suggestion - also making the read buffer size a multiple of the lob chunk size, plus a few other optimizations.

On my machine, for a NOCACHE BLOB (8192-byte chunk size), it is around 15 times faster than the original implementation - which means we are probably only about half slower than an external table (if Winston tested on a NOCACHE blob too). Not too bad.

btw I've checked for regressions only on my test collection of 10046 trace files - but I foresee no reason why it couldn't be used for generic files, too.

Here's the (commented) code - hth people building their own "yet another 10046 trace analyzer" :)

bye
Alberto

create or replace type tr_string_array as table of varchar2(4000);
/
----------------------------------------
function blob_as_lines ( p_cursor sys_refcursor )
return tr_string_array
pipelined
as
    l_bfile                    blob;
    l_last                     number;
    l_current                  number;
    l_line_sep                 raw(2);
    l_line_sep_length          number;
    l_chunk_size               number;
    l_read_buffer              raw (32767);
    l_sliding_window           raw (32767);
    l_optimal_read_size        number;
    l_offset                   integer default 1;
    l_read_size                integer;
    l_read_buffer_extracted    integer default 1;
    l_sliding_window_remaining integer default 0;
    l_sliding_window_length    integer;

    ------------------------------------
    -- returns the line separator [Unix: chr(10), Windows: chr(13)||chr(10)]
    function calc_line_sep (p_blob blob)
    return raw
    is
        l_0a_pos         number;
        l_char_before_0a raw(1);
    begin
        -- get first chr(10) position
        l_0a_pos := dbms_lob.instr( p_blob, '0A', 1, 1 );

        -- lucky strike - first char is chr(10) - it's Unix
        if (l_0a_pos = 1) then
            return hextoraw('0A');
        end if;

        -- get char before first chr(10)
        l_char_before_0a := dbms_lob.substr (p_blob, 1, l_0a_pos-1);

        -- return separator
        if l_char_before_0a = hextoraw ('0D') then
            return hextoraw('0D0A');
        else
            return hextoraw('0A');
        end if;
    end calc_line_sep;
begin
    fetch p_cursor into l_bfile;
    if ( p_cursor%notfound ) then
        close p_cursor;
        raise no_data_found;
    end if;
    close p_cursor;

    -- calculate the line separator
    l_line_sep        := calc_line_sep (l_bfile);
    l_line_sep_length := utl_raw.length (l_line_sep);

    -- calculate the optimal buffer size (greatest multiple of blob chunk size)
    l_chunk_size        := dbms_lob.getchunksize (l_bfile);
    l_optimal_read_size := trunc (32767 / l_chunk_size) * l_chunk_size;
    if l_optimal_read_size = 0 then -- chunk size > max pl/sql raw length (32767)
        l_optimal_read_size := 32767;
    end if;

    -- Algorithm (unrolled):
    -- 1) read from blob into l_read_buffer
    -- 2) copy l_read_buffer into l_sliding_window
    -- 3) pipe complete lines contained in l_sliding_window
    -- 4) slide the window (move last fragmented line to l_sliding_window left)
    -- 5) read from blob into l_read_buffer
    -- 6) fill l_sliding_window, appending as much as possible from l_read_buffer
    -- 7) repeat (3) and (4)
    -- 8) append remaining bytes from l_read_buffer to l_sliding_window
    -- 9) repeat 5-8 ad libitum
    loop
        -- read "optimal size bytes" from blob into l_read_buffer
        l_read_size := l_optimal_read_size;
        dbms_lob.read ( l_bfile, l_read_size, l_offset, l_read_buffer );
        l_offset := l_offset + l_read_size;

        -- exit when blob EOF and no more bytes remain in l_sliding_window
        exit when l_read_size + l_sliding_window_remaining = 0;

        -- fill l_sliding_window, appending as much as possible from l_read_buffer
        l_read_buffer_extracted :=
            least (32767 - l_sliding_window_remaining, l_read_size);

        l_sliding_window := utl_raw.concat (
            l_sliding_window,
            utl_raw.substr ( l_read_buffer, 1, l_read_buffer_extracted )
        );

        l_sliding_window_length := utl_raw.length (l_sliding_window);

        -- pipe complete lines
        l_last := 1;
        loop
            -- get the next line separator occurrence
            l_current := instrb ( utl_raw.cast_to_varchar2 ( l_sliding_window ),
                                  utl_raw.cast_to_varchar2 ( l_line_sep ),
                                  l_last );

            -- "special cases" (not probably going to occur for 10046 trace files,
            -- but it's nice to be robust).
            -- If the remaining line fragment is longer than 4000,
            -- pipe first 4000 chars and continue;
            -- if the current line is longer than 4000,
            -- pipe first 4000 chars and continue.
            if ( l_current = 0 and (l_sliding_window_length - l_last) > 4000 )
               or ( (l_current-l_last) > 4000 )
            then
                -- truncate to 4000 bytes (<= 4000 chars)
                l_current := l_last + 4000;
            end if;

            exit when l_current = 0;

            pipe row
            (
                substrb ( utl_raw.cast_to_varchar2 ( l_sliding_window ),
                          l_last,
                          l_current-l_last
                        )
            );

            l_last := l_current + l_line_sep_length;
        end loop;

        -- slide the window
        l_sliding_window :=
            utl_raw.cast_to_raw
            (
                substrb
                (
                    utl_raw.cast_to_varchar2 ( l_sliding_window ), l_last
                )
            );

        -- append remaining bytes from l_read_buffer to l_sliding_window
        -- implementation note: here, (l_read_size-l_read_buffer_extracted) <= 4000
        -- and length(l_sliding_window) <= 4000, so no overflow is possible
        -- (Thanks to the "special cases" mentioned above)
        if (l_read_buffer_extracted < l_read_size) then
            l_sliding_window := utl_raw.concat (
                l_sliding_window,
                utl_raw.substr ( l_read_buffer, l_read_buffer_extracted+1 )
            );
        end if;

        l_sliding_window_remaining := nvl ( utl_raw.length ( l_sliding_window ), 0 );
    end loop;

    return;
end blob_as_lines;

Tom Kyte
October 17, 2004 - 10:15 am UTC

excellent -- thanks!

It did improve a lot

Winston, October 18, 2004 - 5:31 pm UTC

I ran the test again:
loaded the 131033112-byte file into a blob,
and used Alberto's function to query the blob and insert into a table:

external table:                             331 seconds
clob + query_bfile, without optimization: 8,104 seconds
blob + Alberto's optimization:            1,905 seconds

The table containing the blob field was defined like this:
REM  CREATE TABLE "WZHG"."DEMO_BLOB" ("ID" NUMBER, "THEBLOB" BLOB) PCTFREE 
REM  10 PCTUSED 40 INITRANS 1 MAXTRANS 255 STORAGE(INITIAL 65536 FREELISTS 
REM  1 FREELIST GROUPS 1) TABLESPACE "WZHANG_DATA" LOGGING NOCOMPRESS 
REM  LOB ("THEBLOB") STORE AS (TABLESPACE "wzhang_test_DATA" ENABLE STORAGE 
REM  IN ROW CHUNK 4096 PCTVERSION 10 NOCACHE STORAGE(INITIAL 65536 
REM  FREELISTS 1 FREELIST GROUPS 1)) ;

The benchmarked operation is:

create or replace procedure run_this3 as
begin
insert into monitor_job(JOB_NAME,JOB_DT)  values ('run_this3', sysdate);
insert /*+ append */ into wzhg.db2trace_result 
select 137322300012119 file_id, 
max(theop)  myop,
max(text) thetext,
max(exec_time)  exec_time,
max(start_time) start_time,
max(stop_time)  stop_time,
max(user_cpu)  user_cpu,
max(sorts_cnt) sorts_cnt,
max(total_sort_time) total_sort_time
from (
select rn, max(mark) over (order by rn) mark ,  theop,
       text, exec_time, start_time, stop_time, 
       user_cpu, sorts_cnt, total_sort_time from (
           select rownum rn ,
           COLUMN_VALUE  line,
           case when (COLUMN_VALUE like  '  Operation:%'  or  COLUMN_VALUE  like 'Operation:%') then  rownum else null end mark ,           
           case when COLUMN_VALUE like '  Operation: %'  then  substr(COLUMN_VALUE,13) else null end theop ,
           case when COLUMN_VALUE like '  Text     : %'  then  substr(COLUMN_VALUE,12) else null end text ,
           case when COLUMN_VALUE like '  Exec Time: %'  then replace( substr(COLUMN_VALUE,13),'seconds','')  else null end exec_time,
           case when COLUMN_VALUE like '  Start Time: %' then  substr(COLUMN_VALUE,13) else null end start_time ,
           case when COLUMN_VALUE like '  Stop Time: %'  then  substr(COLUMN_VALUE,13) else null end stop_time ,
           case when COLUMN_VALUE like '  User CPU: %'   then  replace(substr(COLUMN_VALUE,12), 'seconds','') else null end user_cpu ,
           case when COLUMN_VALUE like '  Sorts: %'      then  substr(COLUMN_VALUE,8) else null end sorts_cnt ,
           case when COLUMN_VALUE like '  Total sort time %' then  substr(COLUMN_VALUE,18) else null end total_sort_time 
           from
             (select  column_value
      from table( blob_as_lines( cursor( select theblob from demo_blob where id=209 ) 
        ) )
                   )
            )
   )
where   mark is not null
group by mark;    
insert into monitor_job(JOB_NAME,JOB_DT)  values ('run_this3_end', sysdate);
commit;

end;

 1* select job_dt, lead(job_dt) over ( order by job_dt) x , (lead(job_dt) over (order by job_dt)- job_dt)*24*3600 Secds  from monitor_job where job_name like 'run_this3%'
SQL> /

JOB_DT               X                         SECDS
-------------------- -------------------- ----------
2004-oct-18 12:43:37 2004-oct-18 13:15:22       1905
2004-oct-18 13:15:22


Blob size:
SQL> select dbms_lob.getlength(theblob) from demo_blob where id=209
  2  /

DBMS_LOB.GETLENGTH(THEBLOB)
---------------------------
                  131033112


 

Increased chunk size to 8192, but performance is similar to that of 4096.

Winston, October 18, 2004 - 8:23 pm UTC

CREATE TABLE "WZHG".DEMO_BLOB_chunk_8192 ("ID" NUMBER, "THEBLOB" BLOB) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 STORAGE(INITIAL 65536 FREELISTS
1 FREELIST GROUPS 1) TABLESPACE "CENTRATA_DATA" LOGGING NOCOMPRESS
LOB ("THEBLOB") STORE AS (TABLESPACE "WZHANG_DATA" ENABLE STORAGE
IN ROW CHUNK 8192 PCTVERSION 10 NOCACHE STORAGE(INITIAL 65536
FREELISTS 1 FREELIST GROUPS 1)) ;

Now the elapsed time changed to 1,940 seconds.

external table:                                    331 seconds
clob + query_bfile, without optimization:        8,104 seconds
blob + Alberto's optimization, chunk size 4096:  1,905 seconds
blob + Alberto's optimization, chunk size 8192:  1,940 seconds


Sort on Clob Column

A reader, October 19, 2004 - 4:09 pm UTC

Tom,

Actually I have some columns which are CLOBs and they contain a lot of data... not 4 gig though.
The developers want the application to sort on them.

Tom Kyte
October 19, 2004 - 4:23 pm UTC

they can sort on any 4,000 bytes of the data using dbms_lob.substr -- but that is the biggest thing you are going to sort by.
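for example -- a sketch with hypothetical names (t is a table with a clob column named doc):

select id
  from t
 order by dbms_lob.substr( doc, 4000, 1 );

only the first 4,000 characters of each clob participate in the sort.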

Hey, I see another user ...

Alberto Dell'Era, October 20, 2004 - 6:15 am UTC

Winston, I'm glad to see that you have found my optimization useful ...

I've noticed that you have found a less dramatic improvement than I did.
I think it may be useful to know that my benchmark was run by putting this statement in a script:

select rownum r, t.*
  from table (
         tr_utils_pkg.blob_as_lines
         (
           cursor (
             select blob_content from tr_uploaded_files where fid = 88
           )
         )
       ) t;

and then running it from sqlplus, after suppressing the output using "set termout off".
So basically I've benchmarked the speed of blob_as_lines alone + network, while you are benchmarking the insert /*+ append */ too, and also the sql-pl/sql interface between your sql statement and the pipelined function.

Different environments, different statements, so YMMV applies - but if you tested the routine in isolation, we might get some insights and perhaps improve the pipelined function (which is obviously a critical component of any HTML DB-based 10046 trace analyzer) for all the users out there (you and me :).

hth & bye
Alberto
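
For reference, isolating the pipelined scan as Alberto describes might look like this (a sketch using Winston's demo_blob table from above; run it from a script so "set termout off" suppresses the display while the rows are still fetched):

set timing on
set termout off
select rownum r, t.column_value
  from table( blob_as_lines( cursor(
         select theblob from demo_blob where id = 209 ) ) ) t;
set termout on

Comparing that elapsed time with the full insert /*+ append */ run would show how much of the 1,905 seconds is the pipelined function itself.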

transbase database

A reader, December 22, 2004 - 9:41 pm UTC

Tom, have you heard about the Transbase database?
It allows data to be stored on DVDs so the user can use it offline with the application.

Is a similar feature available in Oracle as well?

thanks in advance.

Tom Kyte
December 23, 2004 - 10:55 am UTC

you can put read only tablespaces on whatever media you like. There are even init parameters to say "don't bother touching all of those slow devices on startup, do a lazy open on them" in support of having tablespaces on slow media like that.
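
a sketch of that setup (names and paths hypothetical; read_only_open_delayed is the init parameter that defers opening the slow datafiles until first access):

alter tablespace archive_2003 read only;

-- take the tablespace offline, copy its datafile to the DVD,
-- then repoint the controlfile at the new location:
alter tablespace archive_2003 offline;
alter database rename file '/u01/oradata/archive_2003.dbf'
                        to '/dvd/archive_2003.dbf';
alter tablespace archive_2003 online;

-- init.ora: read_only_open_delayed = true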



Read log file in Oracle

Laxman Kondal, March 25, 2005 - 12:15 pm UTC

Hi Tom

I am trying to read an export log file into Oracle. I saw your simple example (simple for you) on this page and tried to adapt it to my requirements, but unfortunately failed.

What am I doing wrong - can you please point it out here?

ORA9R2> desc ei_log

 Name         Null?    Type
 ------------ -------- -----------------------
 ID                    NUMBER
 LOG                   BINARY FILE LOB

ORA9R2> select count(*) from ei_log;

COUNT(*)
----------
2

1 row selected.

ORA9R2> CREATE OR REPLACE FUNCTION query_bfile(p_ref IN P_Global.g_rc )
  2  RETURN P_Coll_Type.vc4000_ntab
  3  pipelined
  4  AS
  5      l_bfile   blob;
  6      l_last    NUMBER;
  7      l_current NUMBER;
  8
  9  BEGIN
 10      FETCH p_ref INTO l_bfile;
 11      IF ( p_ref%NOTFOUND ) THEN
 12          CLOSE p_ref;
 13          RAISE NO_DATA_FOUND;
 14      END IF;
 15      CLOSE p_ref;
 16      --
 17      l_last := 1;
 18      LOOP
 19          l_current := dbms_lob.INSTR( l_bfile, '0A', l_last, 1 );
 20          EXIT WHEN (NVL(l_current,0) = 0);
 21          pipe ROW ( utl_raw.cast_to_varchar2(
 22              dbms_lob.SUBSTR( l_bfile, l_current-l_last+1, l_last ) ) );
 23          --
 24          l_last := l_current+1;
 25      END LOOP;
 26      RETURN;
 27      --
 28  END query_bfile;
 29  /

Function created.

ORA9R2> show err
No errors.
ORA9R2> variable n number
ORA9R2> exec :n := 2

PL/SQL procedure successfully completed.


N
----------
2

ORA9R2> select rownum, column_value
2 from table( query_bfile( cursor( select log from ei_log where id = :n)))
3 where rownum <= 20;
from table( query_bfile( cursor( select log from ei_log where id = :n)))
*
ERROR at line 2:
ORA-00932: inconsistent datatypes: expected NUMBER got FILE
ORA-06512: at "OPOM.QUERY_BFILE", line 10


ORA9R2> exec :n := 1

PL/SQL procedure successfully completed.


N
----------
1

ORA9R2> select rownum, column_value
2 from table( query_bfile( cursor( select log from ei_log where id = :n)))
3 where rownum <= 20;
from table( query_bfile( cursor( select log from ei_log where id = :n)))
*
ERROR at line 2:
ORA-00932: inconsistent datatypes: expected NUMBER got FILE
ORA-06512: at "OPOM.QUERY_BFILE", line 10

ORA9R2>

Thanks for your help.
Regards.


Tom Kyte
March 25, 2005 - 6:34 pm UTC

give the entire test case -- you reference a package variable, I don't have that.

trim it down to the bare minimum, but supply everything one would need to run the entire thing

Read log file in oracle as table column

Laxman Kondal, March 28, 2005 - 9:11 am UTC

Hi Tom

This IS the script I used.

CREATE OR REPLACE PACKAGE P_Global IS
    ....
    TYPE g_rc IS REF CURSOR;
    --
    ...
END P_Global;

CREATE OR REPLACE PACKAGE BODY P_Global IS
END P_Global;

CREATE OR REPLACE PACKAGE P_Coll_Type IS
    ...
    TYPE vc4000_ntab IS TABLE OF VARCHAR2(4000);
    ...
    --
END P_Coll_Type;

create or replace directory EXP_IMP_LOG_DIR as '/tmp';

create or replace procedure load_a_file(
    p_id       in number,
    p_filename in varchar2 )
as
    l_clob  clob;
    l_bfile bfile;
begin
    insert into ei_log values ( p_id, empty_clob() )
    returning theClob into l_clob;
    l_bfile := bfilename( 'EXP_IMP_LOG_DIR', p_filename );
    dbms_lob.fileopen( l_bfile );
    dbms_lob.loadfromfile( l_clob, l_bfile,
                           dbms_lob.getlength( l_bfile ) );
    dbms_lob.fileclose( l_bfile );
end;

CREATE OR REPLACE FUNCTION query_bfile(
    p_ref IN P_Global.g_rc )
RETURN P_Coll_Type.vc4000_ntab
pipelined
AS
    l_bfile   blob;
    l_last    NUMBER;
    l_current NUMBER;

BEGIN
    FETCH p_ref INTO l_bfile;
    IF ( p_ref%NOTFOUND ) THEN
        CLOSE p_ref;
        RAISE NO_DATA_FOUND;
    END IF;
    CLOSE p_ref;
    --
    l_last := 1;
    LOOP
        l_current := dbms_lob.INSTR( l_bfile, '0A', l_last, 1 );
        EXIT WHEN (NVL(l_current,0) = 0);
        pipe ROW ( utl_raw.cast_to_varchar2(
            dbms_lob.SUBSTR( l_bfile, l_current-l_last+1, l_last ) ) );
        --
        l_last := l_current+1;
    END LOOP;
    RETURN;
    --
END query_bfile;

----------------------------------------------------
Now I have everything needed (I think)
----------------------------------------------------


ORA9R2> exec load_a_file( 1, 'exp_test.log' );

PL/SQL procedure successfully completed.

ORA9R2> commit;

ORA9R2> select dbms_lob.substr(theclob,4000,1) from ei_log where id=1;

DBMS_LOB.SUBSTR(THECLOB,4000,1)
---------------------------------------------------------------------------
Connected to: Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
Export done in US7ASCII character set and AL16UTF16 NCHAR character set
server uses WE8ISO8859P1 character set (possible charset conversion)

About to export specified tables via Conventional Path ...
. . exporting table EI_ABC_ELEMENTS 8886 rows exported
. . exporting table EI_AWIS_SYS_MAPPINGS 0 rows exported
. . exporting table EI_BUDGET_YEARS 0 rows exported
. . exporting table EI_ESCALATIONS 0 rows exported
. . exporting table EI_FUNDING_AMOUNTS 0 rows exported
. . exporting table EI_FUNDING_DECISIONS 0 rows exported
. . exporting table EI_LOCATIONS 160 rows exported
. . exporting table EI_MAINT_OBJECTIVES 1008 rows exported
. . exporting table EI_OBJECTIVE_SETS 10 rows exported
. . exporting table EI_ORGANIZATIONS 0 rows exported
. . exporting table EI_QE_OBJECTIVES 3 rows exported
. . exporting table EI_QE_SCHEDULE_IMPACTS 0 rows exported
. . exporting table EI_READINESS_RULES 0 rows exported
. . exporting table EI_SYSTEMS 14 rows exported
. . exporting table EI_TASKS 15742 rows exported
Export terminated successfully without warnings.

1 row selected.


ORA9R2>

-- utlraw.sql and prvtrawb.plb were executed as sysdba

ORA9R2> select object_name, object_type, owner, status from dba_objects where object_name like '%RAW%';

OBJECT_NAME                    OBJECT_TYPE        OWNER      STATUS
------------------------------ ------------------ ---------- -------
/360577e5_OracleTypeRAW        JAVA CLASS         SYS        VALID
PL/SQL LONG RAW                TYPE               SYS        VALID
RAW                            TYPE               SYS        VALID
UTL_RAW                        PACKAGE            SYS        VALID
UTL_RAW                        PACKAGE BODY       SYS        VALID
oracle/sql/RAW                 JAVA CLASS         SYS        VALID
/360577e5_OracleTypeRAW        SYNONYM            PUBLIC     VALID
UTL_RAW                        SYNONYM            PUBLIC     VALID
oracle/sql/RAW                 SYNONYM            PUBLIC     VALID

9 rows selected.

ORA9R2> select rownum, column_value
2 from table( query_bfile( cursor( select theClob from ei_log where id = 1)));
from table( query_bfile( cursor( select theClob from ei_log where id = 1)))
*
ERROR at line 2:
ORA-00932: inconsistent datatypes: expected NUMBER got CLOB
ORA-06512: at "OPOM.QUERY_BFILE", line 11


ORA9R2>
---------------------------------------------------------------

Something I am missing and unable to find out.

Thanks for your help.

Regards.




Tom Kyte
March 28, 2005 - 9:22 am UTC

11 FETCH p_ref INTO l_bfile;



you appear to have a clob in ei_log (not sure, you neglected the create table).

but given your load routine, it would appear to be a clob.


l_bfile is a blob. the database column is a clob.



CREATE OR REPLACE FUNCTION query_bfile(
    p_ref IN P_Global.g_rc )
RETURN P_Coll_Type.vc4000_ntab
pipelined
AS
    l_bfile   clob;
    l_last    NUMBER;
    l_current NUMBER;

BEGIN
    FETCH p_ref INTO l_bfile;
    IF ( p_ref%NOTFOUND ) THEN
        CLOSE p_ref;
        RAISE NO_DATA_FOUND;
    END IF;
    CLOSE p_ref;
    --
    l_last := 1;
    LOOP
        l_current := dbms_lob.INSTR( l_bfile, chr(10), l_last, 1 );
        EXIT WHEN (NVL(l_current,0) = 0);
        pipe ROW (
            dbms_lob.SUBSTR( l_bfile, l_current-l_last+1, l_last ) );
        --
        l_last := l_current+1;
    END LOOP;
    RETURN;
    --
END query_bfile;
/


but you might rename it now.

Read log file in oracle as table column

Laxman Kondal, March 28, 2005 - 12:02 pm UTC

Hi Tom

You found the mistake and now it works.

Thanks - as always, you are the source of Oracle knowledge.

Regards.


Parsing a CLOB as an external table

Doug, June 24, 2005 - 3:01 pm UTC

I have a CLOB column that contains about 25,000 lines of fixed format data, like so:

12345Description of 12345 20050101
23456Description of 23456 20050102
34567Description of 34567 20050103
...

Instead of messing with SUBSTR and INSTR, I'd prefer to use the external table syntax to describe and process the data.

I'm currently writing the CLOB out to a file (using the handy clob_to_file() proc from a different asktom thread) then using an external table definition to process it. Not surprisingly, this turns out to be REALLY fast compared with the SUBSTR/INSTR approach.

Is there any way to avoid the step of writing the CLOB to a file? Is there a way to point an external table definition at a CLOB? Sure would be nice...

Tom Kyte
June 24, 2005 - 6:44 pm UTC

no, there isn't.
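
Doug mentions a clob_to_file() procedure from another thread; something along these lines handles the write-out step (a sketch, not necessarily the exact procedure he used -- the directory name is whatever your directory object allows, and the fflush after each put is what lets it write more than one linesize worth of data without a newline):

create or replace procedure clob_to_file( p_clob in clob,
                                          p_dir  in varchar2,
                                          p_file in varchar2 )
as
    l_output utl_file.file_type;
    l_amt    number default 32000;
    l_offset number default 1;
    l_length number default nvl( dbms_lob.getlength( p_clob ), 0 );
begin
    -- open the target file with the maximum linesize
    l_output := utl_file.fopen( p_dir, p_file, 'w', 32760 );
    while ( l_offset <= l_length )
    loop
        -- write the next 32000 characters, then flush so the
        -- linesize limit is never hit
        utl_file.put( l_output, dbms_lob.substr( p_clob, l_amt, l_offset ) );
        utl_file.fflush( l_output );
        l_offset := l_offset + l_amt;
    end loop;
    utl_file.fclose( l_output );
end;
/

after that, the external table can be pointed at the written file, as Doug describes.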

