How to load multi-line record into Oracle using SQL*Loader

Question and Answer

Tom Kyte

Thanks for the question, maheen.

Asked: December 13, 2004 - 9:39 pm UTC

Answered by: Tom Kyte - Last updated: May 18, 2011 - 3:08 am UTC

Category: Database - Version: 9.0.2

Viewed 10K+ times!

You Asked

I have a text file that contains millions of records. A complete record covers 5 lines of the text file. I just want to load this data into Oracle,

taking values from each of these 5 lines:

1st line: characters 1 to 10 (character)
1st line: characters 67 to 87 (numeric)
2nd line: characters 55 to 65 (numeric)
3rd line: characters 4 to 14 (character)

There are 118 characters in total on each line.

What will be the control file?
thanks in advance


and we said...

you would use "concatenate 5" to glue 5 records together:

http://docs.oracle.com/cd/B10501_01/server.920/a96652/ch05.htm#1005528


and then adjust your positions -- that is, say these are five 100-character records: record 1 is 1..100, record 2 will be 101..200 and so on.


Just use POSITION to pick off the fields you want after that.
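For instance, applied to the question above (5 physical lines of 118 characters each, giving a 590-character logical record), a control file might look roughly like this. The table and field names are invented, not from the question; the offsets follow the rule above, with line n starting at (n-1)*118 + 1:

```sql
LOAD DATA
INFILE 'data.txt'
CONCATENATE 5
INTO TABLE my_table      -- hypothetical table name
(
  field1 POSITION(1:10)    CHAR,              -- line 1, chars 1..10
  field2 POSITION(67:87)   INTEGER EXTERNAL,  -- line 1, chars 67..87
  field3 POSITION(173:183) INTEGER EXTERNAL,  -- line 2, chars 55..65 -> 118+55 .. 118+65
  field4 POSITION(240:250) CHAR               -- line 3, chars 4..14  -> 236+4  .. 236+14
)
```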

and you rated our response

  (16 ratings)

Is this answer out of date? If it is, please let us know via a Review

Reviews

Excellent

December 14, 2004 - 5:31 pm UTC

Reviewer: A reader


How to ignore some extra lines which are not relevant

December 17, 2004 - 8:35 pm UTC

Reviewer: Bob from Australia

Your advice to use concatenate is excellent. I also used it and got brilliant results. But I have another problem: there are some extra lines after some logical records. How can I ignore them using the control file?
Let me explain further:
In my text file the first 10 lines are headings, then there are 11 logical records (one logical record covers 5 physical lines), and then again 10 extra lines (headings, not actual data), and so on throughout the file. What I have done is manually remove them from the file. But one of my competitors uses SQL Server to load the same file without manually removing them.

Secondly, is it essential to rename the .txt file to a .dat file?

Tom Kyte

Followup  

December 18, 2004 - 8:50 am UTC

read about continueif, and "when" -- you might be able to use them. (lots of documentation on sqlldr)

if there is something in the records (which are physical records in sqlldr speak) that can be used to tell you that you need to glue this physical record with the prior one (instead of "10" then "11", then "10", then "11") you can use continueif to assemble them, and a WHEN clause to see if you want to process it.
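As a hedged sketch of what that can look like (the marker characters, positions, and names here are invented, not taken from the file described above): suppose each continuation line carries a '+' in column 1 and each heading line starts with 'H'. Then something like:

```sql
LOAD DATA
INFILE 'data.txt'
-- glue the NEXT physical line onto the current one whenever its
-- first character is '+' (note: the examined column is consumed)
CONTINUEIF NEXT (1) = '+'
INTO TABLE my_table
-- after assembly, discard logical records that are really headings
WHEN (1) <> 'H'
(
  rec POSITION(1:80) CHAR   -- placeholder field layout
)
```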

Same Problem

December 23, 2004 - 10:01 pm UTC

Reviewer: Ahmad Imran from Pakistan

Hello, Merry Xmas to you.
I have the same problem as above; I tried WHEN and CONTINUEIF but they don't seem feasible here.

I have a text file like this
(you can copy and paste it into a .txt file for a better look):

DATE : 08/11/:4 P A K I S T A N T E L E C O M M U N I C A T I O N C O M P A N Y L T D. PAGE : 1

P A Y P H O N E B I L L R E G I S T E R BILL MONTH :OCT,04

REGION : A-F.T.R. DUE DATE: 24/11/04 COMPANY CODE :Q9
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
PHONE EXCH. C. WHTAX NET WHTAX PRS/READ NET-CLS LOC-CHR LINE-RENT TRUNK NWD CHG. SALES TAX ARR.GST ARREARS
INST.DATE NAME PRV/READ TST-CLS OTH.CHG. SP.FAC PHONO OVERSEAS CUR. BILL NET GST PAY. BEF. DUE
L.MOB.CH N.MOB.CH 800 CHG. VMS CHG. 900 CHG. REBATES SURCHARG PAY. AFT. DUE
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

2550226-82 101 0.00 0.00 33015 559 1123.59 174 0.00 1141.68 476.00 0.00 0.46
14 11 2002 M/S GLAXY COMMUNICATION PVT 32456 0.00 18 0.00 13.34 3157.00 476.36 3157.00
C/O USMAN AMDAD PCO 613.20 110.00 0.00 0.00 0.00 513.46 697.00 3854.00
AMDAD ALI ST.NO.8 C-BLOCK
BABU WALA FAISALABAD


2550329-04 101 0.00 0.00 21863 1203 2418.03 174 0.00 1087.41 704.00 0.00 -0.06
02 06 2003 M/S GALAXY COMMUNI PVT LTD C 20660 0.00 18 0.00 0.00 4661.00 704.19 4661.00
MUHAMMAD IRSHAD 935.20 80.00 0.00 0.00 0.00 755.45 1026.00 5687.00
IMTIAZ PCO SHOP NO-1
P-108 AYUB COLONY GATE NO-1
JHANG ROAD FAISALAHAD

2550416-71 101 0.00 0.00 10588 631 1268.31 174 0.00 801.99 437.00 0.00 0.03
21 11 2003 M/S GLAXY COM PVT LTD C/O 9957 0.00 18 0.00 0.00 2913.00 436.83 2913.00
BISMILLAH ALI PCO GHULAM 616.00 52.00 0.00 0.00 0.00 454.15 622.00 3535.00
SARWAR P-3/3 GANDA NALA ROAD
RAHIM TOWN JADUGE WALA
FAISALABAD
DATE : 08/11/:4 P A K I S T A N T E L E C O M M U N I C A T I O N C O M P A N Y L T D. PAGE : 2

P A Y P H O N E B I L L R E G I S T E R BILL MONTH :OCT,04

REGION : A-F.T.R. DUE DATE: 24/11/04 COMPANY CODE :Q9
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
PHONE EXCH. C. WHTAX NET WHTAX PRS/READ NET-CLS LOC-CHR LINE-RENT TRUNK NWD CHG. SALES TAX ARR.GST ARREARS
INST.DATE NAME PRV/READ TST-CLS OTH.CHG. SP.FAC PHONO OVERSEAS CUR. BILL NET GST PAY. BEF. DUE
L.MOB.CH N.MOB.CH 800 CHG. VMS CHG. 900 CHG. REBATES SURCHARG PAY. AFT. DUE
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

2552687-15 101 0.00 0.00 0 0 0.00 174 0.00 42.21 32.00 0.00 0.15
15 09 2004 M/S GLAXY COMMUNI PVT LTD C/ 0 0.00 00 0.00 0.00 238.00 32.43 238.00
LUCKY CSC CHUNGI NO-12 0.00 0.00 0.00 0.00 0.00 10.55 23.00 261.00
NEAR SAIF ABAD BUS STOP
JHANG ROAD FAISALABAD


2552874-93 101 0.00 400.00 1574 582 1169.82 174 0.00 2409.99 877.00 1028.39 8284.46
02 08 2004 M/S GLAXY COMMUNI PVT LTD C/ 992 348.00 00 0.00 433.55 5778.00 1905.25 14062.00
ABDUL QADER S/O MOLVI 1016.40 294.00 0.00 0.00 0.00 944.84 1281.00 15343.00
ABDUL REHMAN SABZI MANDI ROAD
NEAR DERA BHOL GUJJAR
KHALIDABAD FAISALABAD
DATE : 08/11/:4 P A K I S T A N T E L E C O M M U N I C A T I O N C O M P A N Y L T D. PAGE : 3

P A Y P H O N E B I L L R E G I S T E R BILL MONTH :OCT,04

REGION : A-F.T.R. DUE DATE: 24/11/04 COMPANY CODE :Q9
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
PHONE EXCH. C. WHTAX NET WHTAX PRS/READ NET-CLS LOC-CHR LINE-RENT TRUNK NWD CHG. SALES TAX ARR.GST ARREARS
INST.DATE NAME PRV/READ TST-CLS OTH.CHG. SP.FAC PHONO OVERSEAS CUR. BILL NET GST PAY. BEF. DUE
L.MOB.CH N.MOB.CH 800 CHG. VMS CHG. 900 CHG. REBATES SURCHARG PAY. AFT. DUE
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

2655639-26 101 0.00 0.00 12378 627 1260.27 174 0.00 430.14 430.00 0.00 0.25
15 07 2003 M/S GLAXY COMMUNI PVT LTD C/ 11751 0.00 18 0.00 26.68 2952.00 430.42 2952.00
SHAH BAKERS P-159 946.40 32.00 0.00 0.00 0.00 366.25 532.00 3484.00
NEAR BABA AMJAD ALI
SHADAB COLONY
FAISALABAD

2656845-60 101 0.00 0.00 63224 887 1782.87 174 0.00 1756.74 798.00 0.00 0.49
10 02 2003 M/S GLAXY COMMUNI PVT C/O 62337 0.00 18 0.00 0.00 5339.00 797.82 5339.00
MUGHAL OPTICAL SERVICE 1397.20 208.00 0.00 0.00 0.00 795.75 1102.00 6441.00
P-172 FAZAL ELLAHI MKT OPP
M.C GIRLS SCHOOL PARTABNAGAR
FAISALABAD

2590166-82 102 0.00 0.00 1815 1680 3376.80 174 0.00 3427.05 1579.00 0.00 -0.09
28 09 2004 M/S GLAXY COMMUN. (PVT) LTD 135 0.00 00 0.00 320.16 10490.00 1578.54 10490.00
C/O SHAMA ESTATE AGENCY 2749.60 476.00 0.00 0.00 0.00 1612.16 2217.00 12707.00
CHOOR MAJRA MORE, G.M.ABAD
FAISALABAD.


I have a table like this

CREATE TABLE PTC(
PHONE NUMBER(7, 0),
EXCH NUMBER(4, 0),
CWHTAX NUMBER(9, 2),
NETWHTAX NUMBER(9, 2),
NETCALLS NUMBER(6, 0),
LINERENT NUMBER(9, 2),
TRUNK NUMBER(9, 2),
NWDCHARGES NUMBER(9, 2),
SALESTAX NUMBER(9, 2),
ARRGST NUMBER(9, 2),
ARREARS NUMBER(9, 2),
INSTDATE VARCHAR2(10),
TSTCALLS NUMBER(6, 0),
OTHCHARGES NUMBER(9, 2),
SPFAC NUMBER(9, 2),
PHONO NUMBER(9, 2),
OVERSEAS NUMBER(9, 2),
CURRBILL NUMBER(9, 2),
NETGST NUMBER(9, 2),
PAYBEFDUE NUMBER(9, 2),
LMOBCHARGES NUMBER(9, 2),
NMOBCHARGES NUMBER(9, 2),
S800CHARGES NUMBER(9, 2),
SVMSCHARGES NUMBER(9, 2),
S900CHARGES NUMBER(9, 2),
REBATE NUMBER(9, 2),
SURCHARGE NUMBER(9, 2),
PAYAFTDUE NUMBER(9, 2),
ADDR1 VARCHAR2(30),
ADDR2 VARCHAR2(30),
ADDR3 VARCHAR2(30),
ADDR4 VARCHAR2(30),
ADDR5 VARCHAR2(30)
)
;

and my control file is this

load data
infile 'test.dat'
replace
concatenate 7
into table PTC
(
PHONE POSITION(01:07) INTEGER EXTERNAL,
EXCH POSITION(13:16) INTEGER EXTERNAL,
CWHTAX POSITION(19:26) DECIMAL EXTERNAL,
NETWHTAX POSITION(28:36) DECIMAL EXTERNAL,
NETCALLS POSITION(48:53) INTEGER EXTERNAL,
LINERENT POSITION(67:72) DECIMAL EXTERNAL,
TRUNK POSITION(75:82) DECIMAL EXTERNAL,
NWDCHARGES POSITION(85:93) DECIMAL EXTERNAL,
SALESTAX POSITION(97:106) DECIMAL EXTERNAL,
ARRGST POSITION(109:117) DECIMAL EXTERNAL,
ARREARS POSITION(124:132) DECIMAL EXTERNAL,
INSTDATE POSITION(133:142) CHAR,
TSTCALLS POSITION(180:185) INTEGER EXTERNAL,
OTHCHARGES POSITION(188:196) DECIMAL EXTERNAL,
SPFAC POSITION(199:204) DECIMAL EXTERNAL,
PHONO POSITION(208:214) DECIMAL EXTERNAL,
OVERSEAS POSITION(216:225) DECIMAL EXTERNAL,
CURRBILL POSITION(229:238) DECIMAL EXTERNAL,
NETGST POSITION(241:249) DECIMAL EXTERNAL,
PAYBEFDUE POSITION(256:264) DECIMAL EXTERNAL,
LMOBCHARGES POSITION(299:307) DECIMAL EXTERNAL,
NMOBCHARGES POSITION(309:317) DECIMAL EXTERNAL,
S800CHARGES POSITION(322:328) DECIMAL EXTERNAL,
SVMSCHARGES POSITION(330:337) DECIMAL EXTERNAL,
S900CHARGES POSITION(339:346) DECIMAL EXTERNAL,
REBATE POSITION(349:357) DECIMAL EXTERNAL,
SURCHARGE POSITION(373:381) DECIMAL EXTERNAL,
PAYAFTDUE POSITION(387:396) DECIMAL EXTERNAL,
ADDR1 POSITION(144:173) char,
ADDR2 POSITION(269:298) char,
ADDR3 POSITION(401:430) char,
ADDR4 POSITION(533:562) char,
ADDR5 POSITION(665:694) char
)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Now the problem: every logical record starts with a digit, like 2550226 in the first case, and is glued up over 7 lines, so I use CONCATENATE 7 for the purpose. If I manually remove the headings I am able to insert the data, but I do not want to manually change the file; I want the loader to skip the headings. The problem is that the headings span 10 lines while a record spans 7 lines. A WHEN clause would be possible if there were an equal number of lines.

So what to do here is the question?

To put the question briefly once more: how can the loader read only the records whose lines start with a digit?

Thanks for your time and suggestions.
You are my last hope; I have tried every other possibility.



Tom Kyte

Followup  

December 24, 2004 - 8:57 am UTC

if your headings are 3 and your record is 7, i would say your record is 10 -- shift everything over. concatenate 10 and POSITION over the header fields.

How to tell loader to start with a particular alphabet

January 01, 2005 - 10:43 pm UTC

Reviewer: Michael Curtis from USA

In the text file, along with the multi-line logical records, there is scattered text that spans 5 lines and can appear anywhere in the file (sometimes it comes after 2 logical records and sometimes after 10 logical records). But it is confirmed that every logical record starts with an 'A' and then spans up to 4 lines, so we use "concatenate 4" here. Is there a way to tell the loader to start reading a logical record whenever it finds an 'A' at the start, and to skip the physical lines otherwise?


Tom Kyte

Followup  

January 02, 2005 - 10:44 am UTC

'and left the physical lines'?

but the concatenate happens and then a filter can be applied, so I don't think so. I did not follow your narrative 100%, but it doesn't sound promising.

*

January 02, 2005 - 9:27 pm UTC

Reviewer: Michael Curtis from USA

'and left the physical lines'?
I mean ==> ignore the extra physical lines.

I have to consider two things:
1) the length of a logical record = 4 physical lines
2) the length of the "extra" block = 5 physical lines

So concatenate 4 glues the logical records, but it does not account for the 5 extra physical lines, which disturb the logical records.

I see two solutions:
1. Manually make each 4 lines into 1 logical record.
2. Manually remove every extra line from the file.

But ...
Both of these solutions require manual interaction with the file, which is difficult for huge files.
So any suggestions please.
(Again: is it possible to tell the loader to start reading when a line starts with 'A' or 'B' or 'C' etc., and to ignore it otherwise?)


Tom Kyte

Followup  

January 02, 2005 - 10:39 pm UTC

or option 3:

concatenate 9?

you seem to be indicating there is in fact a logical pattern here - 4 records we like, 5 we don't, 4 we do and so on.

so what's wrong with 9?

*

January 04, 2005 - 7:37 pm UTC

Reviewer: Michael Curtis from USA

From my previous reply:

(sometimes the extra lines come after 2 logical records and sometimes after 10 logical records, etc.)

So again, concatenate 9 wouldn't work here.

I am very sorry that I am taking your valuable time, but I must say I always learn a lot from here.
Here is the detail once again:
Let's say file start here After ||
|||||||||||||||||||||||||||||||||||||||||||||||||||
1st heading line (means extra lines)
2nd heading line (means extra lines)
3rd heading line (means extra lines)
4th heading line (means extra lines)
5th heading line (means extra lines)
1st record(line 1) which goes to database
1st record(line 2) which goes to database
1st record(line 3) which goes to database
1st record(line 4) which goes to database
1st heading line (means extra lines)
2nd heading line (means extra lines)
3rd heading line (means extra lines)
4th heading line (means extra lines)
5th heading line (means extra lines)
2nd record(line 1) which goes to database
2nd record(line 2) which goes to database
2nd record(line 3) which goes to database
2nd record(line 4) which goes to database
3rd record(line 1) which goes to database
3rd record(line 2) which goes to database
3rd record(line 3) which goes to database
3rd record(line 4) which goes to database
4th record(line 1) which goes to database
4th record(line 2) which goes to database
4th record(line 3) which goes to database
4th record(line 4) which goes to database
1st heading line (means extra lines)
2nd heading line (means extra lines)
3rd heading line (means extra lines)
4th heading line (means extra lines)
5th heading line (means extra lines)
5th record(line 1) which goes to database
5th record(line 2) which goes to database
5th record(line 3) which goes to database
5th record(line 4) which goes to database
1st heading line (means extra lines)
2nd heading line (means extra lines)
3rd heading line (means extra lines)
4th heading line (means extra lines)
5th heading line (means extra lines)
........(it is possible that the next 10 records come consecutively here)
||||||||||||||||||||||||||||||||||||||||||
and so on?
It is just like a situation where the headings are deptnos and the records are employees, so each deptno may contain fewer or more employees.
So I think you now understand the problem better.
Thanks anyway



Tom Kyte

Followup  

January 05, 2005 - 9:18 am UTC

probably not going to happen with SQLLDR

I can do it with SQL and external tables though.

i know! a hope is always there

January 05, 2005 - 10:03 pm UTC

Reviewer: Michael Curtis from USA

I put sample emp table data in a case similar to mine so you can easily understand the situation. Just copy and paste the text below into a .txt file and then the game starts (:p). I did not use my own data because its number of columns (30+) makes it much more complex.
Here you can assume datatypes just as for the emp table.
|||||||||||||||||||||||||||||||||||
empno ename
job mgr
hiredate sal
comm deptno
************************
7839 KING
PRESIDENT
17-NOV-81 5000
10
7782 CLARK
MANAGER 7839
09-JUN-81 2450
10
7934 MILLER
CLERK 7782
23-JAN-82 1300
10
empno ename
job mgr
hiredate sal
comm deptno
************************
7566 JONES
MANAGER 7839
02-APR-81 2975
20
7902 FORD
ANALYST 7566
03-DEC-81 3000
20
7369 SMITH
CLERK 7902
17-DEC-80 800
20
7788 SCOTT
ANALYST 7566
09-DEC-82 3000
20
7876 ADAMS
CLERK 7788
12-JAN-83 1100
20
empno ename
job mgr
hiredate sal
comm deptno
************************
7698 BLAKE
MANAGER 7839
01-MAY-81 2850
30
7654 MARTIN
SALESMAN 7698
28-SEP-81 1250
1400 30
7499 ALLEN
SALESMAN 7698
20-FEB-81 1600
300 30
7844 TURNER
SALESMAN 7698
08-SEP-81 1500
0 30
7900 JAMES
CLERK 7698
03-DEC-81 950
30
7521 WARD
SALESMAN 7698
22-FEB-81 1250
500 30
||||||||||||||||||||||||||||||||

Hope this time I achieve my goal: getting this type of data into the Oracle database without viewing and manually changing the file. In the current scenario each record starts with a digit.


Tom Kyte

Followup  

January 06, 2005 - 10:10 am UTC

i told you -- i'll do this with an external table and sql

but not sqlldr.

you can play with sqlldr all you want, I'm not even going to try -- this is like "loading a report" (they have an example of it in the documentation). involves triggers and all kinds of nasty stuff.

my suggestion -- load it into a table (or use external tables) and process with SQL.

Please do it using sql and external table

January 06, 2005 - 6:29 pm UTC

Reviewer: Michael Curtis from USA

My goal is to get the data into the database without changing the file contents, using either sqlldr or an external table. So if you help me here using an external table, that's OK.
I do not have much experience, so that's why I put the emp table above, to make it easy to understand. Once you do this with the emp table, I will do it with my table.

Millions of thanks in advance

Tom Kyte

Followup  

January 06, 2005 - 7:48 pm UTC

load input file into table with 

seq -- increasing number
X   -- varchar2(whatever)

and then:

ops$tkyte@ORA9IR2> select x, next_x1, next_x2, next_x3
  2    from (
  3  select substr(x,1,instr(x,' ' )-1) field1_x,
  4         trim( substr(x,instr(x,' '))) field2_x,
  5         x,
  6         lead(x,1) over (order by seq) next_x1,
  7         lead(x,2) over (order by seq) next_x2,
  8         lead(x,3) over (order by seq) next_x3
  9    from t
 10         )
 11   where translate( field1_x, '0123456789','0000000000' ) = rpad( '0', length(field1_x), '0' )
 12     and length( replace( translate( field2_x, '0123456789','0000000000' ), '0', '' ) ) = length( field2_x )
 13  /
 
X                              NEXT_X1                        NEXT_X2                        NEXT_X3
------------------------------ ------------------------------ ------------------------------ ------------------------------
7839        KING               PRESIDENT                      17-NOV-81   5000                           10
7782        CLARK              MANAGER     7839               09-JUN-81   2450                           10
7934        MILLER             CLERK       7782               23-JAN-82   1300                           10
7566        JONES              MANAGER     7839               02-APR-81   2975                           20
7902        FORD               ANALYST     7566               03-DEC-81   3000                           20
7369        SMITH              CLERK       7902               17-DEC-80   800                            20
7788        SCOTT              ANALYST     7566               09-DEC-82   3000                           20
7876        ADAMS              CLERK       7788               12-JAN-83   1100                           20
7698        BLAKE              MANAGER     7839               01-MAY-81   2850                           30
7654        MARTIN             SALESMAN    7698               28-SEP-81   1250               1400        30
7499        ALLEN              SALESMAN    7698               20-FEB-81   1600               300         30
7844        TURNER             SALESMAN    7698               08-SEP-81   1500               0           30
7900        JAMES              CLERK       7698               03-DEC-81   950                            30
7521        WARD               SALESMAN    7698               22-FEB-81   1250               500         30
 
14 rows selected.
 


You must build a where clause that does the right thing for your data. Here I was looking for a line where the first field is a number and the second field a string without numbers -- you need to customize this to YOUR data. 
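To complete the picture, the pivoted rows could be fed straight into the target table with an INSERT ... SELECT. The following is only a sketch: the fixed substr offsets (columns 1..12, then 13 onward) are a guess at the sample's two-fields-per-line layout, and the missing mgr/comm cases rely on TO_NUMBER of a blank string yielding NULL -- adjust everything to your data:

```sql
insert into emp (empno, ename, job, mgr, hiredate, sal, comm, deptno)
select to_number(substr(x,1,12)),                        -- line 1: empno, ename
       trim(substr(x,13)),
       trim(substr(next_x1,1,12)),                       -- line 2: job, mgr
       to_number(trim(substr(next_x1,13))),
       to_date(trim(substr(next_x2,1,12)),'DD-MON-RR'),  -- line 3: hiredate, sal
       to_number(trim(substr(next_x2,13))),
       to_number(substr(next_x3,1,12)),                  -- line 4: comm, deptno
       to_number(trim(substr(next_x3,13)))
  from (select x,
               substr(x,1,instr(x,' ')-1)   field1_x,
               trim(substr(x,instr(x,' '))) field2_x,
               lead(x,1) over (order by seq) next_x1,
               lead(x,2) over (order by seq) next_x2,
               lead(x,3) over (order by seq) next_x3
          from t)
 -- same filter as above: first field all digits, second field no digits
 where translate(field1_x,'0123456789','0000000000') = rpad('0',length(field1_x),'0')
   and length(replace(translate(field2_x,'0123456789','0000000000'),'0','')) = length(field2_x);
```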

Wonderful

January 06, 2005 - 8:41 pm UTC

Reviewer: Michael from USA

I hope I will convert the file after some work.
It's really amazing. Thanks a lot!

Loading a Large file into several tables

January 07, 2005 - 8:33 am UTC

Reviewer: denni50 from na

Hi Tom

We are about to download and import a large file (1.7 million records).
If we import the data through the software import module we will have to break it down into small chunks (10,000 records at a time). The good thing about the import module is that it loads the data into the appropriate base tables, for example:

first, middle, lastname get imported into the customer table
address1, city, state, zip import into the address table
areacode, phone, email import into the phone table... etc.

I know sqlldr will not do this... after reading about external tables here, is it possible to get this done via that mechanism?

thanks


Tom Kyte

Followup  

January 07, 2005 - 9:31 am UTC

sqlldr can load into multiple tables easily (you can have multiple into clauses)

but external tables are the way to go.
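For reference, the multiple-INTO-clause form looks roughly like this. Everything here (table names, columns, positions) is illustrative only, assuming a fixed-width input file; each INTO TABLE clause picks its own slice of the same logical record:

```sql
LOAD DATA
INFILE 'people.dat'
APPEND
INTO TABLE customer
( firstname POSITION(1:15)    CHAR,
  middle    POSITION(16:30)   CHAR,
  lastname  POSITION(31:50)   CHAR )
INTO TABLE address
( addr1     POSITION(51:80)   CHAR,
  city      POSITION(81:100)  CHAR,
  state     POSITION(101:102) CHAR,
  zip       POSITION(103:112) CHAR )
INTO TABLE phone
( areacode  POSITION(113:115) CHAR,
  phone     POSITION(116:123) CHAR,
  email     POSITION(124:160) CHAR )
```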

thanks Tom...

January 07, 2005 - 12:33 pm UTC

Reviewer: denni50 from na

will do some research.


continueif and Trailing NulCols

July 25, 2006 - 5:35 pm UTC

Reviewer: Rao from USA NJ

Tom


I have a file with carriage returns in one of the fields (U_Address). I am getting data from MySQL and cutting a flat file out of it, then using the loader to load it into Oracle. But the carriage return turns one logical record into 2 physical records, which I have handled through the CONTINUEIF statement. With this, a few rows are rejected because of trailing columns, and sqlldr suggests I use the TRAILING NULLCOLS clause in the control file. When I use TRAILING NULLCOLS, the data does not get loaded into the right columns: it gets off by one column and skips the first column of the second physical record. Here is the sample data file:

"*","Abc","Abc <<newline_character>>
","", "DEF","XYZ","CDE1"
"*","Abc1","Abc1 <<newline_character>>
","","DEF1","XYZ1","CDE1"


control file is

LOAD DATA
INFILE '/u01/app/oracle/faheem/w3t_Users_3.csv'
APPEND
CONTINUEIF Next(1)<>'"*"'
INTO TABLE W3T_USERS_TEST3
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(Record_term,
U_LOGINNAME ,
U_USERNAME ,
U_Address ,
U_EMAIL ,
U_FIRST_NAME ,
U_LAST_NAME )


First record:
Record_term = <<blank>> (also, I don't understand why SQL*Loader keeps this blank even though there is an asterisk in the file)
U_USERNAME = Abc,
U_Address = Abc <<newline_character>>,
U_EMAIL = DEF (note it has skipped the blank column from the file),
U_FIRST_NAME = XYZ,
U_LAST_NAME = "CDE1"

When I don't use the TRAILING NULLCOLS clause, SQL*Loader errors out (bad records) on the records that have a carriage return in the U_Address field.

Now I don't understand why this happens.



Please advise

Regards

Rao

Tom Kyte

Followup  

July 25, 2006 - 5:47 pm UTC

did you really mean to just "concatenate 2" - isn't it true that every two records is one?

continueif and Trailing NulCols

July 26, 2006 - 10:49 am UTC

Reviewer: Faheem from NJ USA

Tom,

Thanks for the response,

Actually it is not always true that 2 physical records make one logical record. That is true only for records having a newline character in the address field, which splits one logical record into 2 physical records in the flat file.

What I have done is add an asterisk while dumping the records into the flat file:

select "*", Table_Name.* from table_name (this is from MySQL)

This adds a "*" before the start of each new logical record. In the control file I have put the CONTINUEIF clause: continue if the next physical record's first field is not equal to "*". The physical record that exists because of the extra newline character in the address field is picked up by the loader only if I specify the TRAILING NULLCOLS clause (in which case it skips the first field of the 2nd physical record, as I mentioned in my previous post); otherwise it rejects those records and puts them in the bad file.

Also, part of the question is that the first field ("*"), which I am using in the CONTINUEIF clause, is not being populated in the target table. It is actually an asterisk; I don't need it, but I don't understand the loader's behavior here. Maybe the loader does not populate the field used in the CONTINUEIF clause, and since the first field of the second physical record is an actual field (not a *), the loader compares that field against the CONTINUEIF clause, finds that the clause returns true, continues with the second physical record, but skips that actual field.


Regards

Faheem



Tom Kyte

Followup  

July 26, 2006 - 11:42 am UTC

problem is the continueif is "eating" the beginning of each line.

instead of beginning each line with *, could you END each line with *?

<quote src=Expert Oracle Database Architecture>

Use the STR Attribute

This is perhaps the most flexible method of loading data with embedded newlines. Using the STR attribute, we can specify a new end-of-line character (or sequence of characters). This allows us to create an input data file that has some special character at the end of each line—the newline is no longer “special.”
I prefer to use a sequence of characters, typically some special marker, and then a newline. This makes it easy to see the end-of-line character when viewing the input data in a text editor or some utility, as each record still has a newline at the end of it. The STR attribute is specified in hexadecimal, and perhaps the easiest way to get the exact hexadecimal string we need is to use SQL and UTL_RAW to produce the hexadecimal string for us. For example, assuming we are on UNIX where the end-of-line marker is CHR(10) (linefeed) and our special marker character is a pipe symbol (|), we can write this:

ops$tkyte@ORA10G> select utl_raw.cast_to_raw( '|'||chr(10) ) from dual;
 
UTL_RAW.CAST_TO_RAW('|'||CHR(10))
-------------------------------------------------------------------------------
7C0A

which shows us that the STR we need to use on UNIX is X'7C0A'. 
Note    On Windows, you would use UTL_RAW.CAST_TO_RAW( '|'||chr(13)||chr(10) ).

To use this, we might have a control file like this:

LOAD DATA
INFILE demo.dat "str X'7C0A'"
INTO TABLE DEPT
REPLACE
FIELDS TERMINATED BY ','
TRAILING NULLCOLS
(DEPTNO,
  DNAME        "upper(:dname)",
  LOC          "upper(:loc)",
  COMMENTS
)

So, if our input data looks like this:

[tkyte@desktop tkyte]$ cat demo.dat
10,Sales,Virginia,This is the Sales
Office in Virginia|
20,Accounting,Virginia,This is the Accounting
Office in Virginia|
30,Consulting,Virginia,This is the Consulting
Office in Virginia|
40,Finance,Virginia,This is the Finance
Office in Virginia|

[tkyte@desktop tkyte]$

where each record in the data file ends with a |\n, the previous control file will load it correctly.
</quote> 

continueif and Trailing NulCols

July 26, 2006 - 1:05 pm UTC

Reviewer: Faheem from USA NJ


Tom

Thanks for the very simple and useful solution to the problem. The STR strategy worked.

This is for my better understanding of the CONTINUEIF clause: so CONTINUEIF will eat up that field -- how much does it eat? Does it eat up only the field we are comparing against, or even more?
Please explain it a bit, and also: is that standard loader behavior?

Regards

Faheem

Tom Kyte

Followup  

July 26, 2006 - 1:39 pm UTC

continueif eats the same bit of each line. If you said "look at the first character of each line and continue if it is equal to this value", it would eat the first character.
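In other words (this is documented SQL*Loader behavior): with CONTINUEIF THIS and CONTINUEIF NEXT, the examined character positions are stripped from every physical record before assembly, continuation or not, so all POSITION counts must be computed as if that column never existed. CONTINUEIF LAST, by contrast, examines the last non-blank character and does not remove anything. A small sketch of the THIS form:

```sql
-- input lines (column 1 is the continuation flag):
--   %part-a
--   %part-b
--    part-c
--
-- with the clause below, '%' in column 1 means "continue onto the
-- next physical record"; column 1 is consumed from every line, so
-- the assembled logical record is:  part-apart-bpart-c
CONTINUEIF THIS (1:1) = '%'
```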

sqlldr

July 22, 2009 - 9:35 am UTC

Reviewer: Antonio Jose Cantillo from Cali, Colombia

hi tom,

I need to read multi-line records. The text file contains one record per line, but sometimes one record spans multiple lines... the problem is COMMENT_ (2000).

I tried this, and it doesn't work:
control.ctl

load data
infile 'MO.dat'
CONTINUEIF (1) = '"'
into table MO_COMMENT
fields terminated by "|" optionally enclosed by '"'
( COMMENT_ID ,COMMENT_ char(2000),MOTIVE_ID ,COMMENT_TYPE_ID ,CAUSAL_ID ,PACKAGE_ID ,REGISTER_DATE ,PERSON_ID ,ORGANIZAT_AREA_ID ,LIMIT_RESPONSE_DATE )


"1796050"|"MEDIANTE . EXPEDIENT"|""|"26"|""|"1917897"|""|""|""|""
"1796051"|"MEDIANTE OFICIO . DE MANTENIMIENTO 009.
EXPEDIENTE 026."|""|"26"|""|"1917897"|""|""|""|""


I tried this too, and it doesn't work:

load data
infile 'MO.dat'
CONTINUEIF (1) = '*'
into table MO_COMMENT
fields terminated by "|" optionally enclosed by '"'
( COMMENT_ID ,COMMENT_ char(2000),MOTIVE_ID ,COMMENT_TYPE_ID ,CAUSAL_ID ,PACKAGE_ID ,REGISTER_DATE ,PERSON_ID ,ORGANIZAT_AREA_ID ,LIMIT_RESPONSE_DATE )


"*"|"1796050"|"MEDIANTE . EXPEDIENT"|""|"26"|""|"1917897"|""|""|""|""
"*"|"1796051"|"MEDIANTE OFICIO . DE MANTENIMIENTO 009.
EXPEDIENTE 026."|""|"26"|""|"1917897"|""|""|""|""


One solution is to delete the <newline> characters with vi (:g/[^"]$/j!), but the text file is 5GB and that is very slow.

Is it possible to read these records?

thx a lot..

Tom Kyte

Followup  

July 26, 2009 - 6:40 am UTC

Your file format must change - or you will not be using sqlldr to do this. We have no way to know when your line ends.

This is from "Expert Oracle Database Architecture" on loading with sqlldr:

<quote>

Load Data with Embedded Newlines

This is something that has been problematic for SQLLDR historically -- how to load free-form data that may include a newline in it. The newline character is the default 'end of line' character to SQLLDR, and the ways around this did not offer much flexibility in the past. Fortunately, in Oracle 8.1.6 and later versions we have some new options. The options for loading data with embedded newlines are now as follows:

o Load the data with some other character in the data that represents a newline (for example, put the string \n in the text where a newline should appear) and use a SQL function to replace that text with a CHR(10) during load time.
o Use the FIX attribute on the INFILE directive, and load a fixed length flat file.
o Use the VAR attribute on the INFILE directive, and load a varying width file that uses a format such that the first few bytes of each line is the length of the line to follow.
o Use the STR attribute on the INFILE directive to load a varying width file with some sequence of characters that represent the end of line, as opposed to just the newline character representing this.

We will demonstrate each in turn.

Use a Character Other than a Newline


This is an easy method if you have control over how the input data is produced. If it is easy enough to convert the data when creating the data file, this will work fine. The idea is to apply a SQL function to the data on the way into the database, replacing some string of characters with a newline. Let's add another column to our DEPT table:

ops$tkyte@ORA10G> alter table dept add comments varchar2(4000);
Table altered.


We'll use this column to load text into. An example control file with inline data could be:

LOAD DATA
INFILE *
INTO TABLE DEPT
REPLACE
FIELDS TERMINATED BY ','
TRAILING NULLCOLS
(DEPTNO,
  DNAME        "upper(:dname)",
  LOC          "upper(:loc)",
  COMMENTS     "replace(:comments,'\\n',chr(10))"
)
BEGINDATA
10,Sales,Virginia,This is the Sales\nOffice in Virginia
20,Accounting,Virginia,This is the Accounting\nOffice in Virginia
30,Consulting,Virginia,This is the Consulting\nOffice in Virginia
40,Finance,Virginia,This is the Finance\nOffice in Virginia


Notice how in the call to replace we had to use \\n, not just \n. This is because \n is recognized by SQLLDR as a newline, and it would have converted it into a newline, not a two-character string. When we execute SQLLDR with the above control file, the table DEPT is loaded with:

ops$tkyte@ORA10G> select deptno, dname, comments from dept;
 
    DEPTNO DNAME          COMMENTS
---------- -------------- -------------------------
        10 SALES          This is the Sales
                          Office in Virginia
 
        20 ACCOUNTING     This is the Accounting
                          Office in Virginia
 
        30 CONSULTING     This is the Consulting
                          Office in Virginia
 
        40 FINANCE        This is the Finance
                          Office in Virginia


Use the FIX Attribute


The FIX attribute is another method available to us. If you use this, the input data must appear in fixed length records. Each record will be exactly the same number of bytes as every other record in the input data set. With positional data, this is especially common; these files are typically fixed length input files to begin with. With 'free form' delimited data, it is less likely that you will have a fixed length file, as these files are generally of varying length (that is the entire point of delimited files: to make each line only as big as it needs to be).

When using the FIX attribute, we must use an INFILE clause, as FIX is an option to INFILE. Additionally, with this option the data must be stored externally, not in the control file itself. So, assuming we have fixed length input records, we can use a control file such as this:

LOAD DATA
INFILE demo.dat "fix 80"
INTO TABLE DEPT
REPLACE
FIELDS TERMINATED BY ','
TRAILING NULLCOLS
(DEPTNO,
  DNAME        "upper(:dname)",
  LOC          "upper(:loc)",
  COMMENTS
)


This specifies an input data file that will have records that are 80 bytes each. This includes the trailing newline that may or may not be there. In this case, the newline is nothing special in the input data file. It is just another character to be loaded or not. This is the thing to understand: the newline at the end of the record (if present) will become part of the record. In order to fully understand this, we need a utility to dump the contents of a file on screen so we can see what is really in there. Using Unix (or any Linux variant), this is pretty easy to do with od, a program to dump files to the screen in octal and other formats. We'll use the following demo.dat file:

[tkyte@desktop tkyte]$ od -c -w10 -v demo.dat
0000000   1   0   ,   S   a   l   e   s   ,   V
0000012   i   r   g   i   n   i   a   ,   T   h
0000024   i   s       i   s       t   h   e
0000036   S   a   l   e   s  \n   O   f   f   i
0000050   c   e       i   n       V   i   r   g
0000062   i   n   i   a
0000074
0000106
0000120   2   0   ,   A   c   c   o   u   n   t
0000132   i   n   g   ,   V   i   r   g   i   n
0000144   i   a   ,   T   h   i   s       i   s
0000156       t   h   e       A   c   c   o   u
0000170   n   t   i   n   g  \n   O   f   f   i
0000202   c   e       i   n       V   i   r   g
0000214   i   n   i   a
0000226
0000240   3   0   ,   C   o   n   s   u   l   t
0000252   i   n   g   ,   V   i   r   g   i   n
0000264   i   a   ,   T   h   i   s       i   s
0000276       t   h   e       C   o   n   s   u
0000310   l   t   i   n   g  \n   O   f   f   i
0000322   c   e       i   n       V   i   r   g
0000334   i   n   i   a
0000346
0000360   4   0   ,   F   i   n   a   n   c   e
0000372   ,   V   i   r   g   i   n   i   a   ,
0000404   T   h   i   s       i   s       t   h
0000416   e       F   i   n   a   n   c   e  \n
0000430   O   f   f   i   c   e       i   n
0000442   V   i   r   g   i   n   i   a
0000454
0000466
0000500
[tkyte@desktop tkyte]$


Notice that in this input file the newlines (\n) are not there to indicate where the end of the record is for SQLLDR; rather, they are just data to be loaded in this case. SQLLDR is using the fixed width of 80 bytes to figure out how much data to read. In fact, if you look at the input data, the records for SQLLDR are not even terminated by \n in this input file: the character right before department 20's record is a space, not a newline.

Now that we know each and every record is 80 bytes long, we are ready to load it using the control file we listed above with the FIX 80 clause. When we do so, we can see:

ops$tkyte@ORA10G> select '"' || comments || '"' comments from dept;
 
COMMENTS
-------------------------------------------------------------------------------
"This is the Sales
Office in Virginia                          "
 
"This is the Accounting
Office in Virginia                "
 
"This is the Consulting
Office in Virginia                "
 
"This is the Finance
Office in Virginia                      "


You might need to trim this data, since the trailing whitespace is preserved. You can do that right in the control file using the TRIM built-in SQL function.
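For example, reusing the FIX control file above, the COMMENTS field could apply TRIM on the way in (a sketch only; adjust for your own data):

```
LOAD DATA
INFILE demo.dat "fix 80"
INTO TABLE DEPT
REPLACE
FIELDS TERMINATED BY ','
TRAILING NULLCOLS
(DEPTNO,
  DNAME        "upper(:dname)",
  LOC          "upper(:loc)",
  COMMENTS     "trim(:comments)"
)
```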

A word of caution for those of you lucky enough to work on both Windows and UNIX: the end of line marker is different on these platforms. On UNIX it is simply \n; on Windows it is \r\n. In general, if you use the FIX approach, make sure to create and load the file on a homogeneous platform (UNIX and UNIX, or Windows and Windows).
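If you have to produce such a fixed length file yourself on UNIX, a quick way (a sketch, assuming space padding to the fixed width) is to pad each logical record with printf and confirm the width with wc:

```shell
# Pad a logical record (with its embedded newline) out to exactly 80 bytes.
# %-80s left-justifies and space-fills; the embedded newline counts as one byte.
printf '%-80s' '10,Sales,Virginia,This is the Sales
Office in Virginia' | wc -c
# the count printed is 80
```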

Use the VAR Attribute


Another method of loading data with embedded newline characters is to use the VAR attribute. When using this format, each record will begin with some fixed number of bytes that represent the total length of the incoming record. Using this format, I can load varying length records that contain embedded newlines, but only if I have a record length field at the beginning of each and every record. So, if I use a control file such as:

LOAD DATA
INFILE demo.dat "var 3"
INTO TABLE DEPT
REPLACE
FIELDS TERMINATED BY ','
TRAILING NULLCOLS
(DEPTNO,
  DNAME        "upper(:dname)",
  LOC          "upper(:loc)",
  COMMENTS
)


then the var 3 says that the first three bytes of each input record will be the length of that input record. If I take a data file such as:

[tkyte@desktop tkyte]$ cat demo.dat
05510,Sales,Virginia,This is the Sales
Office in Virginia
06520,Accounting,Virginia,This is the Accounting
Office in Virginia
06530,Consulting,Virginia,This is the Consulting
Office in Virginia
05940,Finance,Virginia,This is the Finance
Office in Virginia
[tkyte@desktop tkyte]$


I can load it using that control file. In my input data file, I have four rows of data. The first row starts with 055, meaning that the next 55 bytes represent the first input record. Those 55 bytes include the terminating newline after the word Virginia. The next row starts with 065: it has 65 bytes of text, and so on. Using this format of data file, we can easily load our data with embedded newlines.

Again, if you are using UNIX and Windows (the above example was UNIX where a newline is one character long), you would have to adjust the length field for each record. On Windows the above .DAT file would have to have 56, 66, 66, and 60 as the length fields in this particular example.
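Those length prefixes are easy to verify from the command line (a Unix shell sketch; printf expands \n, so the record matches exactly what SQLLDR will read):

```shell
# The first logical record, including both embedded newlines, is 55 bytes,
# hence the 055 prefix when loading with "var 3".
printf '10,Sales,Virginia,This is the Sales\nOffice in Virginia\n' | wc -c
```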

Use the STR Attribute


This is perhaps the most flexible method of loading data with embedded newlines. Using the STR attribute, I can specify a new end of line character (or sequence of characters). This allows you to create an input data file that has some special character at the end of each line; the newline is no longer 'special'.

I prefer to use a sequence of characters, typically some special 'marker', and then a newline. This makes it easy to see the end of line character when viewing the input data in a text editor or some utility, as each record still has a newline at the end of it. The STR attribute is specified in hexadecimal, and perhaps the easiest way to get the exact hexadecimal string you need is to let SQL and UTL_RAW produce it for us. For example, assuming you are on UNIX where the end of line marker is CHR(10) (line feed) and your special marker character is a pipe |, we can write this:

ops$tkyte@ORA10G> select utl_raw.cast_to_raw( '|'||chr(10) ) from dual;
 
UTL_RAW.CAST_TO_RAW('|'||CHR(10))
-------------------------------------------------------------------------------
7C0A


which shows us that the STR we need to use on UNIX would be X'7C0A'.
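The same string can be produced outside the database with the od utility used earlier (just a sanity check, not part of the load itself):

```shell
# | is byte 0x7c and the Unix newline is 0x0a, giving the
# X'7C0A' string used in the STR clause.
printf '|\n' | od -An -tx1
```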

NOTE: On Windows, you would use UTL_RAW.CAST_TO_RAW( '|'||chr(13)||chr(10) )
To use this, we might have a control file like this:

LOAD DATA
INFILE demo.dat "str X'7C0A'"
INTO TABLE DEPT
REPLACE
FIELDS TERMINATED BY ','
TRAILING NULLCOLS
(DEPTNO,
  DNAME        "upper(:dname)",
  LOC          "upper(:loc)",
  COMMENTS
)


So, if your input data looks like this:

[tkyte@desktop tkyte]$ cat demo.dat
10,Sales,Virginia,This is the Sales
Office in Virginia|
20,Accounting,Virginia,This is the Accounting
Office in Virginia|
30,Consulting,Virginia,This is the Consulting
Office in Virginia|
40,Finance,Virginia,This is the Finance
Office in Virginia|
[tkyte@desktop tkyte]$


where each record in the data file ends with |\n, the above control file will load it correctly.

Embedded Newlines Wrap-Up


So, we have explored at least four ways to load data with embedded newlines. In the very next section, we will use one of these, the STR attribute, in a generic unload utility to avoid issues with regard to newlines in text.
Additionally, one thing to be very aware of, and I've mentioned it above a couple of times, is that on Windows (all flavors), text files may end in \r\n (ASCII 13 + ASCII 10, carriage return/line feed). Your control file will have to accommodate this: that \r is part of the record. The byte counts in FIX and VAR, and the string used with STR, must accommodate it. For example, if you took any of the above .DAT files that currently contain just \n in them and ftp'ed them to Windows using an ASCII transfer (the default), every \n would turn into \r\n. The same control file that just worked on UNIX would no longer be able to load the data. This is something you must be aware of and take into consideration when setting up the control file.
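That extra byte per line is easy to demonstrate (a Unix shell sketch):

```shell
# 'Office in Virginia' is 18 bytes of text; with a Windows \r\n ending
# the line is 20 bytes, versus 19 with just \n, so every FIX and VAR
# byte count grows by one per line after an ASCII-mode transfer.
printf 'Office in Virginia\r\n' | wc -c
```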

</quote>

preserve previous record value if next value is null

May 16, 2011 - 1:24 am UTC

Reviewer: jafery from Pakistan

Hi Tom

I have the below comma-separated file

deptno,empno,ename
20,7022,BLAKE
 ,3432,MIC
30,2129,JIM

Now, as you can see, the data above has the first line as a header, and in the second record we don't have a deptno. I want to preserve the previous value of deptno, which is 20, and insert the data of the 2nd record with deptno 20 instead of null. How can I do this in sqlldr/external table?

Thanks

Tom Kyte

Followup  

May 18, 2011 - 3:08 am UTC

you would use "skip 1" in the external table definition.

and once you have the external table defined, you would:


select last_value(deptno ignore nulls) over (order by r), empno, ename
from (select rownum r, et.* from et)
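For completeness, a minimal external table definition for that file might look like the following sketch (the directory object MY_DIR, the file name emp.dat, and the table name ET are assumptions here, not something given in the question):

```sql
create table et
( deptno number,
  empno  number,
  ename  varchar2(30)
)
organization external
( type oracle_loader
  default directory my_dir
  access parameters
  ( records delimited by newline
    skip 1
    fields terminated by ','
    missing field values are null
  )
  location ( 'emp.dat' )
)
reject limit unlimited;
```

With that in place, the query above carries the last non-null deptno forward over the missing value.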