Skip to Main Content
  • Questions
  • Read particular lines from CLOB columb

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, shirish.

Asked: October 23, 2012 - 2:57 am UTC

Last updated: October 30, 2012 - 7:01 pm UTC

Version: 11.2.0

Viewed 10K+ times! This question is

You Asked

Hi,

I have a table with column of CLOB data type..

Suppose i have a text in clob column as

"Hi i am SK142
I am also SK345
I am even SK346
SK444 is out of reach
This person SK567 is not valid"

From this text i need to fetch only word starting with SK---
Example : SK142, SK345, SK346, SK444, SK567 i need to get all of these.

How can i achieve this ?

and Tom said...

here is one approach:

ops$tkyte%ORA11GR2> drop table t;

Table dropped.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create table t ( id number, x clob );

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into t (id,x) values ( 1,
  2  'Hi i am SK142
  3  I am also SK345
  4  I am even SK346
  5  SK444 is out of reach
  6  This person SK567 is not valid');

1 row created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into t (id,x) values ( 2,
  2  'Hi i am SK142
  3  I am even SK346
  4  SK444 is out of reach
  5  This person SK567');

1 row created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create or replace function parse( p_clob in clob ) return sys.odciVarchar2List
  2  pipelined
  3  as
  4          l_offset number := 1;
  5          l_clob   clob := translate( p_clob, chr(13)|| chr(10) || chr(9), '   ' ) || ' ';
  6          l_end    number;
  7          l_hit    number;
  8          l_len    number := dbms_lob.getlength( p_clob );
  9  begin
 10          loop
 11                  l_hit := instr( l_clob, 'SK', l_offset );
 12                  exit when nvl(l_hit,0) = 0;
 13                  l_end := instr( l_clob, ' ', l_hit );
 14  
 15                  pipe row ( substr( l_clob, l_hit, l_end-l_hit+1 ) );
 16  
 17                  l_offset := l_end;
 18          end loop;
 19  end;
 20  /

Function created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select id, column_value from t, table(parse(t.x));

        ID COLUMN_VAL
---------- ----------
         1 SK142
         1 SK345
         1 SK346
         1 SK444
         1 SK567
         2 SK142
         2 SK346
         2 SK444
         2 SK567

9 rows selected.

Rating

  (5 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Question on your solution

Nirav, October 24, 2012 - 6:40 am UTC

Hi Tom,
I am not much on pl/sql and trying to understand your solution.

What does this part of the code do or why is it there:
l_clob   clob := translate( p_clob, chr(13)|| chr(10) || chr(9), ' 
  ' ) || ' ';


Thanks
Nirav
Tom Kyte
October 25, 2012 - 8:52 am UTC

13,10,9 are the ascii codes for carriage return, linefeed and tab. I'm turning them into a space so that when something is at the end of a line or separated from the next word by a tab - we just have to look for a space

putting a space on the end makes it so that if the last word in the text is SKnnnn - we can still easily parse it out.


it shouldn't be a "I'm not much on plsql" - the code should be readable by any programmer pretty easily.

shirish katti, October 25, 2012 - 12:44 am UTC

Thanks for the answer... I worked on it a bit and found other solution to it some what similar to yours.. Please find the logic below just in case if you are interested

declare
v_str varchar2(500) := 'THIS SK123 IS THE NEXT SK2345 OF THE STRING SK34567 THE END';
v_pos NUMBER := 0;
begin
v_pos := INSTR(v_str, ' SK', v_pos+1)+1;
while SUBSTR(v_str, v_pos, 2) = 'SK'
loop
dbms_output.put_line(SUBSTR(v_str, v_pos, INSTR(v_str, ' ', v_pos+1)-v_pos));
v_pos := INSTR(v_str, ' SK', v_pos+1)+1;
end loop;
end;
Tom Kyte
October 25, 2012 - 9:06 am UTC

and what if the string starts with SK? you won't find ' SK'

and what if the string has SKnnn\n as this one does - instr( v_str, ' ' ) won't work.


Barry Chase, October 25, 2012 - 9:50 am UTC

l_clob clob := translate( p_clob, chr(13)|| chr(10) || chr(9), '
' ) || ' ';

translate CARRIAGERETURN||LINEFEED||TAB to block of spaces

From what I can tell it is putting everything on a single line that is within that CLOB so that the parsing is not performed across carriage returns, line feeds, or tabs. If none of those occur, then at a minimum you have a space from one clob to another if processing multiple clobs.
Tom Kyte
October 25, 2012 - 9:53 am UTC

the end space is to work with the condition that SKnnnn might be the last bit of text.


Also, i just noticed a possible issue, I should put a space on the front of the string too and look for
 ' SK' 
to find "words that start with" SK.

Parsing Text of Package Specs

Vinayak Awasthi, October 30, 2012 - 3:10 am UTC

Hi Tom,

I have a similar requirement. I have a package specs where we holds all the record types used in the application. However, now we have a new requirement where we have to extract text from this specs and compare record columns to a 3rd party file.

Can you suggest some ways of performing such extraction. Here is my sample specs and required outpu:

Create or replace package test_pk_specs
IS
AS
/******************************************************* *
* this is a smaple specs comment line 1
* this is a smaple specs comment line 2
* this is a smaple specs comment line 3
****************************************************/

-- Record type for Emp Code
TYPE rcd_emp_code IS RECORD (
id_type VARCHAR2 (30)
,emp_code VARCHAR2 (100)
,id_version NUMBER (38)
,create_date DATE);

-- Ref cursor for Emp Code
TYPE rc_emp_code IS REF CURSOR
RETURN rcd_emp_code;

-- Record type for Dept Code
TYPE rcd_dept_code IS RECORD (
id_type VARCHAR2 (30)
,dept_code VARCHAR2 (100)
--, dept_name VARCHAR2(1000) This is not required
,dept_city VARCHAR2(100));

TYPE rc_dept_code IS REF CURSOR
RETURN rcd_dept_code;

END test_pk_specs;
/

I problem I see is the way we can insert inline comments in the specs. The output I want is:

Record_Name Columns
rcd_emp_code id_type
rcd_emp_code emp_code
rcd_emp_code id_version
rcd_emp_code create_date

rcd_dept_code id_type
rcd_dept_code dept_code
rcd_dept_code dept_city

See in the dept record type, I do not see dept_name as that is commented out.

Can you please enlighten some approach here.

Tom Kyte
October 30, 2012 - 7:01 pm UTC

well, if your coding standards are like that (nice - very uniform), then

ops$tkyte%ORA11GR2> select rec_name, field_name
  2    from (
  3  select text,
  4         line,
  5         rec_name,
  6         case when row_number() over (partition by max_tag order by line) > 1
  7              then substr( ltrim( ltrim( text, ' ' ), ',' ), 1, instr( ltrim(ltrim(text, ' ' ), ','), ' ')-1 )
  8          end field_name,
  9         max_tag,
 10         lv_keep_flag,
 11         row_number() over (partition by max_tag order by line) rn
 12    from (
 13  select text,
 14         line,
 15         max(tag) over (order by line) max_tag,
 16         last_value(record_name ignore nulls) over (order by line) rec_name,
 17         last_value(keep_flag ignore nulls) over (order by line) lv_keep_flag
 18    from (
 19  select line,
 20         text ,
 21         case when text like '%TYPE % IS %RECORD %(%'
 22              then substr( text, instr(text,'TYPE ')+5, instr( text, ' IS ')-instr( text, 'TYPE ')-5 )
 23          end record_name,
 24         case when text like '%TYPE % IS %RECORD %(%'
 25              then row_number() over (order by line)
 26              when lag(text) over (order by line) like '%);%'
 27              then row_number() over (order by line)
 28          end tag,
 29         case when text like '%TYPE % IS %RECORD %(%'
 30              then 1
 31              when lag(text) over (order by line) like '%);%'
 32              then 0
 33          end keep_flag
 34    from user_source
 35   where name = 'TEST_PK_SPECS'
 36     and type = 'PACKAGE'
 37     and replace( text, ' ', '' ) NOT like '--%'
 38         )
 39         )
 40   where lv_keep_flag = 1
 41         )
 42   where rn > 1
 43   order by line;

REC_NAME        FIELD_NAME
--------------- ---------------
rcd_emp_code    id_type
rcd_emp_code    emp_code
rcd_emp_code    id_version
rcd_emp_code    create_date
rcd_dept_code   id_type
rcd_dept_code   dept_code
rcd_dept_code   dept_city

7 rows selected.

Thanks a lot !!!

Vinayak, November 15, 2012 - 4:38 am UTC

Your idea worked like charm....

Though , unfortunately, our code standards were followed as required and there were many deviations from the code I posted but I managed to get around them after getting direction from your approach.

SQL is overwhelmingly powerful...
Analytics rolls and rocks as usual....

Many thanks Tom

More to Explore

CLOB/BLOB/etc

Complete documentation on Securefiles and Large Objects here