Skip to Main Content

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Lucy.

Asked: July 26, 2018 - 4:53 am UTC

Last updated: July 26, 2018 - 10:29 am UTC

Version: Oracle 11g

Viewed 10K+ times! This question is

You Asked

I am using External table to read a csv file, which has some rows with '#' at the beginning that need to be skipped. How can I do that?

ORGANIZATION EXTERNAL
  (  TYPE ORACLE_LOADER
     DEFAULT DIRECTORY GPC_DATA_CSV_DIR
     ACCESS PARAMETERS 
       ( RECORDS DELIMITED BY NEWLINE
     NOBADFILE NODISCARDFILE NOLOGFILE
     SKIP 1
     FIELDS TERMINATED BY ','
     MISSING FIELD VALUES ARE NULL
             )
     LOCATION (GPC_DATA_CSV_DIR:'simple.csv')
  )


I ended up not skipping any row, but when I used the external table , I just filter out the ones the first field start with #.

and Chris said...

Well, filtering out the values in your where clause is a valid method. If you're feeling lazy you can hide this in a view:

declare
  handle utl_file.file_type;
begin
  handle := utl_file.fopen ( 'TMP', 'test.txt', 'w' );
  
  utl_file.put_line ( handle, 'load,this,line' );
  utl_file.put_line ( handle, 'and,this,line' );
  utl_file.put_line ( handle, '#not,this,line' );
  utl_file.put_line ( handle, '#and,!this,line' );
  utl_file.fclose ( handle );
end;
/

drop table t cascade constraints purge;
create table t (
  c1 varchar2(10),
  c2 varchar2(10),
  c3 varchar2(10)
) organization external (
  default directory tmp
  location ( 'test.txt' )
);

select * from t;

C1     C2      C3     
load   this    line   
and    this    line   
#not   this    line   
#and   !this   line   

create or replace view vw as 
  select * from t
  where  c1 not like '#%';
  
select * from vw;

C1     C2     C3     
load   this   line   
and    this   line 


But if you're certain you never want to see these lines you can make the table bypass them. Use the load when clause to define which records to skip.

This has to use = or !=. Not like is out. You can't apply functions such as substr to the columns either.

So you need to use positional notation. Access the first character on the line with (1:1). And check it's not #:

drop table t cascade constraints purge;
create table t (
  c1 varchar2(10),
  c2 varchar2(10),
  c3 varchar2(10)
) organization external (
  default directory tmp
  access parameters (
    records delimited by newline
    load when ( 1:1 ) != '#' 
  )
  location ( 'test.txt' )
);

select * from t;

C1     C2     C3     
load   this   line   
and    this   line 


Is this answer out of date? If it is, please let us know via a Comment

More to Explore

Design

New to good database design? Check out Chris Saxon's full fundamentals class.