Skip to Main Content
  • Questions
  • Handle records with newline delimeter in file using External Table

Breadcrumb

May 4th

Question and Answer

Chris Saxon

Thanks for the question, Avinash.

Asked: September 09, 2015 - 1:14 pm UTC

Last updated: September 18, 2015 - 10:13 am UTC

Version: 11.2.0.4

Viewed 10K+ times! This question is

You Asked

Hi Tom,

We are using external table to read file. But few fields in the file contains newline resulting in rejection to the bad file.

Could you please let us know how to handle such records through external tables.

CREATE TABLE IFC_DWH_SALESFORCE.EXT_TASK
(
ID VARCHAR2(256 CHAR),
TYPE VARCHAR2(256 CHAR),
CALLTYPE VARCHAR2(256 CHAR),
DESCRIPTION VARCHAR2(4000 CHAR),
STATUS VARCHAR2(256 CHAR),
OWNERID VARCHAR2(256 CHAR),
CREATEDDATE VARCHAR2(256 CHAR),
CALLDISPOSITION VARCHAR2(256 CHAR),
PRIORITY VARCHAR2(256 CHAR),
ACTIVITYDATE VARCHAR2(256 CHAR),
WHAT_ID VARCHAR2(256 CHAR),
WHAT_NAME VARCHAR2(256 CHAR),
WHAT_TYPE VARCHAR2(256 CHAR),
WHOID VARCHAR2(256 CHAR)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY SALESFORCE_DIR
ACCESS PARAMETERS
( RECORDS DELIMITED BY NEWLINE
SKIP 1
badfile SALESFORCE_DIR:'exp_task.bad'
logfile SALESFORCE_DIR:'exp_task.log'
FIELDS TERMINATED BY ","
OPTIONALLY ENCLOSED BY '"'
MISSING FIELD VALUES ARE NULL )
LOCATION (SALESFORCE_DIR:'EXP_TASK.csv')
)
REJECT LIMIT UNLIMITED;

Sample file:

"ID","TYPE","CALLTYPE","DESCRIPTION","STATUS","OWNERID","CREATEDDATE","CALLDISPOSITION","PRIORITY","ACTIVITYDATE","WHAT.ID","WHAT.NAME","WHAT.TYPE","WHOID"
"00Tb0000002bHdtEAE","Call","","Per favore modificate i nomi delle opportunities. Grazie.","In Progress","005b0000000ECb4AAG","2013-03-12T20:53:17.000Z","","Normal","","","","",""
"00Tb0000002bHduEAE","Call","","Per favore modificate i nomi delle opportunities. Grazie.","In Progress","005b0000000ECb5AAG","2013-03-12T20:53:17.000Z","","Normal","","","","",""
"00Tb0000002bHjJEAU","Call","","Per favore modificate i nomi delle opportunities. Grazie.","In Progress","005b0000000ECbYAAW","2013-03-12T20:53:17.000Z","","Normal","","","","",""
"00Tb0000004S7C7EAK","Call","","Olga rekla, ze tam mas zavolat
nech si dat kontaktni osobu
To Uta: Dolaďují technické řešení s Dušanem a poté se ozvou, vedou nás v patrnosti a chtějí spolupracovat.","In Progress","005b0000000ETPGAA4","2013-05-21T08:15:14.000Z","","Normal","2013-05-24","001b000000659KWAAY","Siemens Electric Machines, s.r.o.","Account",""
"00Tb0000005KeNlEAK","Call","","Thomas, wir haben in SK neue Anfrage an Rahmen 250T eur. Bitte kannst du prüffen, ob wir dies aus Rahmen in DE schöpfen können? Da ist aber Art den Sicherheiten irgendeine Versicherung. Frag mal bitte nach in Kreditabteilung.
Danke K.","Not Started","005b0000000ED8jAAG","2013-06-24T12:56:30.000Z","","Normal","2013-06-24","006b0000003GEArAAO","Eldisy_Vstrekovací lis VCRS 500/115 TPE t c_96,9TEUR","Opportunity",""

Record with id = '00Tb0000005KeNlEAK' and '00Tb0000004S7C7EAK'
is having new line in the Description which results in rejecting the record in bad file.

The system which provides the file are not ready to change the record delimeter to any other like '|^' or any other. It is NEWLINE only.

Please let us know how to handle such scenario.

Regards,
Avinash Minz.

and Chris said...

You could use the preprocessor option. With this you specify a script to change the newline characters before Oracle reads the file.

For example, if you have a file like this:

"r1col1","singleline","end1"
"r2col1","multi
line","end2"

You can create a script, preprocess.sh, like:
#!/bin/bash
/bin/sed 's/"$/"##/' $1


This will convert the lines ending with double quotes to have ## at the end before Oracle reads it. Note the actual file remains the same.

You just need to change your "records delimited by" clause to be "##" instead of newline.

Putting it all together:

create or replace directory tmp_dir as '/tmp';

create table t (
  x char(10),
  y char(10),
  z char(10)
) organization external
( type oracle_loader
  default directory tmp_dir
  access parameters 
  ( records delimited by '##'
    preprocessor tmp_dir:'preprocess.sh'
    badfile tmp_dir:'exp_task.bad' 
    logfile tmp_dir:'exp_task.log' 
    fields terminated by ","
    enclosed by '"'
   )
   location (tmp_dir:'test.txt')
)
reject limit unlimited;

select rownum, t.* from t;

    ROWNUM X          Y          Z        
---------- ---------- ---------- ----------
         1 r1col1     singleline end1      
         2 r2col1     multi      end2      
                      line                 

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

External Table With Preprocessor

Avinash Minz, September 10, 2015 - 5:11 am UTC

Hi Chris,

Thank you for prompt response.
Preprocessor option is working fine.

Regards,
Avinash Minz.

External Table || Fileds with NEWLINE

Avinash Minz, September 10, 2015 - 8:40 am UTC

Hi Chris,

In one of the feed we are unable to query as it is giving the below error

ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-29400: data cartridge error
KUP-04020: found record longer than buffer size supported, 524288, in /home/dwhinst/dwh/files/SALESFORCE/EXP_EVENT.csv

As suggested we have incorporated the preprocessor option. One of the column named DESCRIPTION is having new line which is causing the problem.

The external table script is as below.

CREATE TABLE IFC_DWH_SALESFORCE.EXT_EVENT
(
ID VARCHAR2(256),
TYPE VARCHAR2(256),
DESCRIPTION VARCHAR2(4000),
OWNERID VARCHAR2(256),
ASSIGNED_COUNTRY_C VARCHAR2(256),
STARTDATETIME VARCHAR2(256),
ENDDATETIME VARCHAR2(256),
CREATEDDATE VARCHAR2(256),
SUBJECT VARCHAR2(256),
ACTIVITYDATE VARCHAR2(256),
ISPRIVATE VARCHAR2(256),
LOCATION VARCHAR2(256),
WHAT_ID VARCHAR2(256),
WHAT_NAME VARCHAR2(256),
WHAT_TYPE VARCHAR2(256),
WHOID VARCHAR2(256)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY SALESFORCE_DIR
ACCESS PARAMETERS
( RECORDS DELIMITED BY '|^'
PREPROCESSOR SALESFORCE_DIR:'preprocess.sh'
SKIP 1
BADFILE SALESFORCE_DIR:'exp_event.bad'
LOGFILE SALESFORCE_DIR:'exp_event.log'
FIELDS TERMINATED BY ","
OPTIONALLY ENCLOSED BY '"'
MISSING FIELD VALUES ARE NULL
)
LOCATION (SALESFORCE_DIR:'EXP_EVENT.csv')
)
REJECT LIMIT UNLIMITED;

Sample record is as below:

"ID","TYPE","DESCRIPTION","OWNERID","ASSIGNED_COUNTRY__C","STARTDATETIME","ENDDATETIME","CREATEDDATE","SUBJECT","ACTIVITYDATE","ISPRIVATE","LOCATION","WHAT.ID","WHAT.NAME","WHAT.TYPE","WHOID"
"00Ub000000CLbjIEAT","Visit","Liebe Kolleginnen und Kollegen,
ich hoffe, es geht Ihnen allen gut. Bevor nun die Urlaubssaison im August
in vielen Ländern den Höhepunkt erreicht und dann auch schon wieder das
Geschäftsjahr der DL zu Ende geht, möchte ich Ihnen einen erneuten
Termin für eine gemeinsame Telefonkonferenz vorschlagen:
Donnerstag, 3. September 14:00h (MESZ)

Thema:
Austausch über die aktuellen Erfahrungen in der Zusammenarbeit SCD / DL-GD
und Klärung von evtl. Fragen und Problemen.
Letztere können Sie mir gerne vorab zuleiten, damit ich die Telko besser vorbereiten kann.
Vielleicht macht es auch Sinn, wenn Sie im Vorfeld noch einmal auf Länderebene
bilateral miteinander telefonieren.
Hier die Einwahldaten:
Einwahlnummer: 030 – 25 55 76 10 bzw. aus dem Ausland +49-30-25 55 76 10
(nach dem Wählen kann es bis zu 15 Sek. dauern, bevor eine Verbindung zustande kommt)
Zugangscode: 677757#
(bitte das Telefon zuvor ggf. auf „Tonwahl“ / „MFV“ umschalten)
Unser Provider bietet auch die folgenden Einwahlnummern
im Ausland an, wenn das für Sie komfortabler ist, vgl. die Liste im Anhang.
<<...>>
Ich freue mich auf Ihre Zusagen.

Beste Grüße

Markus Ruppert
________________________
S-CountryDesk GmbH
- Koordination, Büro Bad Homburg -

c/o Deutsche Leasing AG
Sparkassen und Mittelstand –C.02.708-
Frölingstraße 15-31
61352 Bad Homburg v. d. Höhe
E-Mail: markus.ruppert@countrydesk.de
Tel.: 06172-88-1052
Mobil: 0172-348 7508
www.countrydesk.de < http://www.countrydesk.de/ >","005b0000001TuTuAAK","","2015-09-03T12:00:00.000Z","2015-09-03T13:00:00.000Z","2015-08-03T15:26:39.000Z","Telko S-CountryDesk / DL-GermanDesk","2015-09-03","false","Telefonkonferenz","","","",""

=============
preprocess.sh
=============
#!/bin/bash
/bin/sed 's/"$/"|^/' $1

Could you please what further changes to be done to handle such scenario.

Regards,
Avinash Minz.
Chris Saxon
September 18, 2015 - 10:13 am UTC

The record is larger than the size of the read buffer (512K). You can increase this by setting the readsize parameter to a larger value in the access parameters:

CREATE TABLE EXT_EVENT ( 
  ID VARCHAR2(256)
) ORGANIZATION EXTERNAL ( TYPE ORACLE_LOADER 
  DEFAULT DIRECTORY ORDER_ENTRY 
  ACCESS PARAMETERS  ( 
   READSIZE 1M -- increased readsize
  ) 
  LOCATION (ORDER_ENTRY:'EXP_EVENT.csv') 
) ; 

More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library