Questions About External Files
Elie Grunhaus, December 08, 2002 - 4:17 pm UTC
Thanks very much for your quick reply.
Your comments and examples are very instructive, especially the link you included to the Oracle documentation on external file access. I see there several clauses that will be very useful for me.
A reader, December 11, 2002 - 4:10 pm UTC
Hi Tom,
I have created an external table and I am getting error while accessing it.
SQL> CREATE directory MY_DIR AS 'E:\ADMIN\';
Directory created.
SQL> grant all on directory my_dir to public;
Grant succeeded.
1 create table external_table
2 (EMPNO NUMBER(4) ,
3 ENAME VARCHAR2(10),
4 JOB VARCHAR2(9),
5 MGR NUMBER(4),
6 HIREDATE DATE,
7 SAL NUMBER(7, 2),
8 COMM NUMBER(7, 2),
9 DEPTNO NUMBER(2)
10 )
11 ORGANIZATION EXTERNAL
12 ( type oracle_loader
13 default directory MY_DIR
14 access parameters
15 ( fields terminated by ',' )
16 location ('emp.txt')
17* )
SQL> /
Table created.
SQL> select * from external_table;
select * from external_table
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-04040: file emp.txt in MY_DIR not found
ORA-06512: at "SYS.ORACLE_LOADER", line 14
ORA-06512: at line 1
-- I have EMP.TXT file on e:\admin dir. on server
Thanks,
December 11, 2002 - 10:32 pm UTC
Is e: a permanent drive on the server? (if it is a network mount, you are using windows -- not as easy as unix with mounts)
Is e:\admin accessible to everyone? (remember, oracle isn't running as YOU)
Is e:\admin\emp.txt accessible to everyone?
99999 times out of 100000 -- if the answers are no, yes, yes -- the file really isn't there, it is on the client -- check again.
A reader, December 12, 2002 - 8:26 am UTC
Hi Tom,
e: a permanent drive on the server
----------------------------------
-- e: is a removable (external) disk drive. (We have test PCs here in our lab and they don't have enough space, so I added an external disk on one of the NT workstations and created drive E:\)
Is e:\admin accessible to everyone
---------------------------------
-- About four people are using this instance from their machines. We have given all rights on E:\ to all of them.
Is e:\admin\emp.txt accessible to everyone?
------------------------------------------
-- Since I gave rights on E:\ to all, it is accessible to everybody. Everybody mapped drive E:\ on their machine and they are able to do all operations (rwx) on this drive.
Thanks,
December 12, 2002 - 10:07 am UTC
And if you use a permanent drive -- in order to test the theory that maybe this removable media is not available to the services that start on boot up -- what happens then?
See
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:241814624807
for where I'm going here -- that the removable media is like a network drive and not universally accessible to the services.
Test by using a real disk.
fields are not terminated
Pushparaj, September 25, 2003 - 4:15 pm UTC
Tom,
I have a data file and the fields are not terminated by anything; they are identified by position. What should I provide in the access parameters list?
For example,
My datafile is
11SCOTT10
12SAMMY20
empno position (1,2)
ename position (3,7)
deptno position (8,9)
CREATE TABLE emp_ext
( empno number(2),
ename char(5),
deptno number(2)
)
organization external
( type oracle_loader
default directory DATA_DIR
access parameters
(
fields terminated by ????????
)
location ('emp.dat')
)
/
September 25, 2003 - 11:33 pm UTC
don't use TERMINATED BY -- use POSITION(n:m) instead
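A sketch of how Pushparaj's table might look with positional fields instead of a terminator (directory and file names taken from the question above; untested):

```sql
-- Positional parsing: each field is cut out of the record by
-- character position rather than by a terminator character.
CREATE TABLE emp_ext
( empno  NUMBER(2),
  ename  CHAR(5),
  deptno NUMBER(2)
)
ORGANIZATION EXTERNAL
( TYPE oracle_loader
  DEFAULT DIRECTORY DATA_DIR
  ACCESS PARAMETERS
  ( RECORDS DELIMITED BY NEWLINE
    FIELDS
    ( empno  POSITION(1:2),
      ename  POSITION(3:7),
      deptno POSITION(8:9)
    )
  )
  LOCATION ('emp.dat')
)
/
```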
This is great! BUT how do I do this?
chneih, October 21, 2003 - 10:49 am UTC
I have a CSV datafile optionally enclosed by '"', but with commas inside the enclosed '"' fields.
For example
AcctNO Name JobTitle Age
123,"Schmoe, Joe","Mechaic",23
456,"David","Manager",45
789,"Blaine, David, E","",20
Thanks,
Chneih
October 21, 2003 - 5:09 pm UTC
'tis ok -- that is what ENCLOSED BY is for
Question
chneih, October 22, 2003 - 11:05 am UTC
How do I load everything to a 4 column table. Right now I'm using
fields delimited by ',' optionally enclosed by '"'
AcctNO Name JobTitle Age
123,"Schmoe, Joe","Mechaic",23
456,"David","Manager",45
789,"Blaine, David, E","",20
Thanks again
October 22, 2003 - 6:03 pm UTC
ops$tkyte@ORA920> desc t
Name Null? Type
----------------------------------- -------- ------------------------
A NUMBER(38)
B VARCHAR2(25)
C VARCHAR2(15)
D NUMBER(38)
ops$tkyte@ORA920> !cat t.ctl
LOAD DATA
INFILE *
INTO TABLE T
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(a,b,c,d)
BEGINDATA
123,"Schmoe, Joe","Mechaic",23
456,"David","Manager",45
789,"Blaine, David, E","",20
ops$tkyte@ORA920> !sqlldr / t.ctl
SQL*Loader: Release 9.2.0.3.0 - Production on Wed Oct 22 18:05:09 2003
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Commit point reached - logical record count 3
1* select * from t
ops$tkyte@ORA920> /
A B C D
---------- ------------------------- --------------- ----------
123 Schmoe, Joe Mechaic 23
456 David Manager 45
789 Blaine, David, E 20
Quick question about utl_file.fgetattr
Jimmy, February 05, 2004 - 5:25 am UTC
Hey Tom!
Just a quick question. When loading data into my database via external tables, there is sometimes an External_Table.bad file created in the DATA_DIR.
This is fine but is there any way of using utl_file.fgetattr to scan if the file exists? Normally I would do something like:
l_filename := 'EXTERNAL_TABLE.bad';
utl_file.fgetattr('DATA_DIR',l_filename,l_exists,l_size,l_block_size);
But the problem I am having is that the data loading from the external table is done monthly and the latter part of the .bad file name changes depending on when the procedure is run. Is there any way of using a
LIKE ('EXTERNAL_TABLE_%.bad') to pick up the .bad file from the DATA_DIR?
Searching for any .bad file will not do, because I have multiple external tables, each potentially creating a .bad file.
Any help would be most appreciated!
Thanks
February 05, 2004 - 7:42 am UTC
what is the format you used in the create table statement to name the bad file? did you use a format?
Jimmy, February 05, 2004 - 9:41 am UTC
Not at first, and that was the underlying reason why I needed to search for the filename of the bad file within the DATA_DIR.
I think I am going to have to define a fixed name for each bad file within my create external table scripts, thus avoiding the problem.
Thanks for your help though
Regards
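Following Jimmy's plan of a fixed bad-file name per table, a sketch of pinning the name and then probing for it (the directory and file names here are illustrative, not from the thread):

```sql
-- In the external table's access parameters, pin the bad file name
-- rather than accepting the session-dependent default, e.g.:
--   BADFILE DATA_DIR:'external_table.bad'
-- Then probe for it after each load:
DECLARE
  l_exists     BOOLEAN;
  l_size       NUMBER;
  l_block_size BINARY_INTEGER;
BEGIN
  utl_file.fgetattr( 'DATA_DIR', 'external_table.bad',
                     l_exists, l_size, l_block_size );
  IF l_exists THEN
    dbms_output.put_line( 'rejects found: bad file is ' || l_size || ' bytes' );
  END IF;
END;
/
```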
Accessing a flat file stored in XDB
Dave, February 05, 2004 - 3:09 pm UTC
I have recently been exploring the use of XDB, specifically using XDB as an ftp server.
So people/systems can ftp files to Oracle, through the listener, and they are stored directly in the database. Now my question, and it may be a rather broad one I'm afraid, is this: since this file is now within the database, is there a way for me to ...
i) find it
ii) read the data in it
... through SQL?
I suspect that there's some XML-based way of doing this, but I've been a relational database guy for a long time and the XML stuff is an intimidating learning curve. Obviously I'd like to be able to accept files through ftp and access them directly without writing them out to the o/s and reading them in again.
Does this seem achievable?
February 06, 2004 - 2:15 pm UTC
I asked Sean Dillon, our XML technologist, to take a look at this and here's what he had to say:
-----
Hi Dave,
Yes, absolutely. There are a couple of ways to do what you want to do, and the good news is they've all been documented/sampled/explained time and again. I'll explain some of these things and then point you to some resources on OTN that you should read that will tell you what you want to know.
First... how do you FIND your XML files in the XML DB repository? Well, that's pretty simple. With XML DB come a couple of views you'll want to familiarize yourself with. These are RESOURCE_VIEW and PATH_VIEW. These views provide you a mechanism to access data stored in the XML repository from SQL.
RESOURCE_VIEW:
Column Datatype Description
------ -------- -------------------------------------------------------
RES XMLTYPE A resource in Oracle XML repository
ANY_PATH VARCHAR2 A path used to access the resource in the repository
PATH_VIEW:
Column Datatype Description
------ -------- -----------------------------
PATH VARCHAR2 Path name of a resource
RES XMLTYPE The resource referred by PATH
LINK XMLTYPE Link property
If you look in the XML DB Developer's Guide, chapter 18 is entitled "SQL Access Using RESOURCE_VIEW and PATH_VIEW". You can find the documentation on OTN here:
https://docs.oracle.com
Go to the bottom of that page and get the XML DB Developer's Guide.
For accessing the data, there are a number of ways to do this. If you load in arbitrary XML (i.e., XML that does NOT use a database-registered XML Schema), the only option you have is the XMLType access methods (because essentially, that's what you've loaded into the repository).
There have been quite a few articles on this subject in Oracle Magazine, I'd suggest reading those and doing the examples to get a better understanding of how you can access the XML data stored in the repository. If after that you still have questions, come back to AskTom and fire away ;-) (i.e., no need to rewrite the article's contents here ;-)). Check out these articles:
https://docs.oracle.com
https://asktom.oracle.com/Misc/oramag/on-procedures-flushes-and-writes.html
Hope that helps!
_smd_
External tables or ETL tool
A reader, February 07, 2004 - 12:07 am UTC
Hi Tom,
There is a requirement to create logical records from ASCII input files (10 in total). The input file sizes vary from 1GB to 5GB. The relationships between the various input ASCII files are known. The final output is required as an ASCII file. We also have a third party ETL tool. Initially, I was thinking of using the ETL tool, but on reading your new book (page 411), I am thinking of using external tables (we have Oracle9i - 9.2). Could you please give your advice on the approach below:
1. Create an external table for each input file, and then create tables (as per the example on page 411) in parallel.
2. Create the required indices.
3. Join the various tables created in step 1, and then create the ASCII file through sqlplus.
The advantage of the ETL tool is that I need not create tables, i.e., I can join the ASCII files and finally create the ASCII file. But I am not sure about performance. With the external tables approach, creating tables, then creating indices, and finally spooling the file will also have a performance impact. Further, I noticed before (in older versions of Oracle) that there was some limitation on the output file size produced through sqlplus. I am expecting an output file size of 10 GB (records are formed from various fields of the input files). Is there a maximum file size limit for a file produced through sqlplus in 9.2 (64bit)?
Thanks
February 07, 2004 - 2:39 pm UTC
well, there is 0% chance I would even remotely consider using sqlplus for a 10gig output file.
I might use my array_flat program and just run a query that queries the external tables directly to produce the resulting output file. Just configure sufficient TEMP space, pga_aggregate_target (or set big sort_area/hash_area sizes) and let it rip. It could be as easy as 5 create tables (with dbms_stats.set_table_stats to give us a tiny bit of info) and a query. (don't even bother indexing, just do a big bulk query and let it rip).
http://asktom.oracle.com/~tkyte/flat/index.html
look for the pro*c program.
Dave, February 08, 2004 - 10:51 am UTC
Thanks Sean (and Tom), those are great references to start with.
As an architectural followup, is the inclusion of web-server capabilities in the database product a sign that Oracle is aiming to move away from Apache for some products? How "well-featured" do you regard Oracle as an HTTP server currently?
Thanks again.
February 08, 2004 - 8:56 pm UTC
Not a chance -- apache is part and parcel of the solution -- still there in 10g.
HTTP is a very very simple protocol when used simply. apache offers lots more (the whole module concept and all of the existing modules). Setting up http services is pretty easy -- doesn't require apache in all cases. apache is a web server -- we are not trying to take over that entire realm there.
Another question on external files.
Matt, February 10, 2004 - 7:25 pm UTC
I am working on Windows and have to load UNIX-based files.
I have been using the following access parameters:
records delimited by newline
Obviously on windows this is "\r\n". The UNIX files are "\n". I am currently getting the message:
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-29400: data cartridge error
KUP-04020: found record longer than buffer size supported, 524288, in D:\GRMS
Migration Letterbox\REMCODEV\inbox\regdata.csv
ORA-06512: at "SYS.ORACLE_LOADER", line 14
ORA-06512: at line 1
Looking at the utilities doco I understand that I can use hex values. I have tried the following:
records delimited by "OA"
records delimited by "OD"
and I get the error;
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-29400: data cartridge error
KUP-04020: found record longer than buffer size supported, 524288, in D:\GRMS
Migration Letterbox\REMCODEV\inbox\regdata.csv
ORA-06512: at "SYS.ORACLE_LOADER", line 14
ORA-06512: at line 1
What am I doing wrong?
Best Regards,
February 11, 2004 - 8:50 am UTC
can you ftp the files to your machine to get the newlines correct? ftp from unix to windows will accommodate the fact that windows wants two characters to end a line.
Short of that -- not having a windows machine to try anything with and not seeing the entire example, I cannot really comment further right now.
An example
Matt, February 16, 2004 - 2:24 am UTC
Tom,
According to
</code>
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96652/ch12.htm#1007431 <code>
If I am reading the doco correctly, I should be able to use 'delimited by "OA"' to enable me to access a unix file on a windows platform.
Can you please provide a quick and dirty example of this syntax working - it doesn't for me. I have a call open with Oracle support and am not getting a very good response.
If you can please just show me a unix file on a unix box (if that is all you have) being accessed as an external table using the "delimited by "<hex value here>"" syntax it would really be appreciated.
Thanks.
February 16, 2004 - 10:03 am UTC
Ok, so I installed 9ir2 on my windoze VM (vmware -- very cool, lets me run windows, ironically enough, in a window. perfection).....
created a flat file in unix, binary ftp'ed it. then:
CREATE TABLE ext_table
(
EMPNO NUMBER(4),
ENAME VARCHAR2(10),
JOB VARCHAR2(9),
MGR NUMBER(4),
HIREDATE DATE,
SAL NUMBER(7,2),
COMM NUMBER(7,2),
DEPTNO varchar2(4000)
)
ORGANIZATION external
(
TYPE oracle_loader
DEFAULT DIRECTORY MYFILES
ACCESS PARAMETERS
(
RECORDS DELIMITED BY 0x'0a' CHARACTERSET WE8ISO8859P1
FIELDS TERMINATED BY "|" LDRTRIM
REJECT ROWS WITH ALL NULL FIELDS
(
EMPNO CHAR(255)
TERMINATED BY "|",
ENAME CHAR(255)
TERMINATED BY "|",
JOB CHAR(255)
TERMINATED BY "|",
MGR CHAR(255)
TERMINATED BY "|",
HIREDATE CHAR(255)
TERMINATED BY "|",
SAL CHAR(255)
TERMINATED BY "|",
COMM CHAR(255)
TERMINATED BY "|",
DEPTNO CHAR(255)
TERMINATED BY "|"
)
)
location
(
'emp.dat'
)
)REJECT LIMIT UNLIMITED
/
loads it -- the trick is 0x'0a'
Excellent - thanks!
Matt, February 16, 2004 - 4:55 pm UTC
I think that the doco that I have read is a little thin on the ground on this part of the syntax - maybe it has improved in 10G?
Thanks for the quick turnaround - questions like these seem to take an age for Oracle Support to respond to accurately.
How do you find the VMWare - does it work well? I plan to use it to get a 10G install on Linux working on my Win2K machine.
Thanks again.
February 17, 2004 - 7:56 am UTC
you got that backwards. You should use it to get 10g under windows running on your linux machine! windows in a window is such a nice thing.
I find vmware to be pretty stable -- use it to access AOL and other "we only do windoze" software when needed. The database is fully supported under vmware as well.
A reader, March 04, 2004 - 10:31 pm UTC
Tom,
Let's say I have a text file which is delimited by both '|' and ','.
Can you please provide an example by which I can take care of both of these delimiters by creating one external table.
Thanks.
March 05, 2004 - 7:53 am UTC
it doesn't do "delimited by this, that, or the other thing" -- only "delimited by this".
A reader, March 05, 2004 - 9:01 am UTC
Can I do this in sqlldr then?
Thought about preprocessing?
Tanja Wilson, March 05, 2004 - 11:55 am UTC
Depending upon file size, it might be faster to preprocess the file and change the "|" to a "," or vice versa, so you have a file delimited by only one thing.
We use sed (a basic Unix command) to do simple things like this to "clean up" files before passing them to sqlldr / external tables.
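A minimal sketch of that sed cleanup, standardizing on ',' as the single delimiter (the file names here are hypothetical):

```shell
# Turn every '|' into ',' so the file has one delimiter throughout,
# then feed the cleaned copy to sqlldr or the external table.
sed 's/|/,/g' raw_data.txt > clean_data.txt
```

Note this assumes '|' never appears inside a quoted field; if it can, the preprocessing needs to be delimiter-aware rather than a blind substitution.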
location of files
reader, March 07, 2004 - 4:04 pm UTC
Should the flat files be located on the server where the database is if I use external tables? Thanks.
March 07, 2004 - 4:48 pm UTC
yes, they must be accessible to that server.
A reader, March 08, 2004 - 9:43 am UTC
Thanks Tom and Tanja..
Any limitations on the number of external tables you can create
Grant in Calgary, March 25, 2004 - 6:02 pm UTC
Hi Tom,
I have recently built a procedure that uses dynamic sql to generate an external table with a unique name based on the business file type and loads the data into the system. This is an operation that occurs frequently every day and includes files with sizes exceeding 100mb. The end users of the apps that call this procedure also have the ability to "correct" the data after it has been loaded into the system, and occasionally make the wrong correction. Because of this, our old system used to involve tracking down the load file and finding out what the original values looked like to find the errors.
(Sorry for being so long winded!)
With the new system, I would like to maintain the external tables for 6 months to a year, in order to build queries that compare the data in the file to what exists in the system. The only thing that I can think of with this approach is that it may mean there could be thousands or tens of thousands of these external tables.
I wanted to know if there is a downside to maintaining all of these external tables or any limitations in oracle as to how many external table definitions the database can maintain at one time.
If there is, I can build into the procedure, the functionality necessary to create and drop the external table whenever I access it. Is this necessary?
Thanks Again Tom in advance
March 25, 2004 - 9:31 pm UTC
you would just be having a really large data dictionary, on the order of an Oracle Apps/Peoplesoft/SAP one :)
Queries that hit the data dictionary (monitoring tools, GUIs mostly) might be affected, but it shouldn't be a serious issue in most cases -- an external table wouldn't even affect it as much as a "real" table would, since it won't be filling up the extents views like real ones do.
External table and tab delimited data file
SAJ, April 07, 2004 - 2:31 pm UTC
I have a problem with an external table and a tab delimited text file. Specifically, when there are fields in the data that are null. Can you help please?
Here are the scripts
drop TABLE testload;
CREATE TABLE testload (
deptno NUMBER(6),
dname VARCHAR2(20),
loc VARCHAR2(25)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY GMF_DATA_AU
ACCESS PARAMETERS
( FIELDS TERMINATED BY '\t'
OPTIONALLY ENCLOSED BY '"'
missing field values are null
(
deptno integer EXTERNAL(6),
dname CHAR(20),
loc CHAR(25)
)
)
LOCATION ('testload.txt')
)
REJECT LIMIT UNLIMITED
LOGGING
NOCACHE
NOPARALLEL;
The data file is below
12 SARATOGA
10 ACCOUNTING CLEVELAND
11 ART SALEM
13 FINANCE BOSTON
21 SALES PHILA.
22 SALES ROCHESTER
42 INT'L SAN FRAN
With the above script, when I select from the external table
this is what i get
SELECT * FROM testload
deptno dname loc
-----------------------------
12 SARATOGA
10 ACCOUNTING CLEVELAND
11 ART SALEM
13 FINANCE BOSTON
21 SALES PHILA.
22 SALES ROCHESTER
42 INT'L SAN FRAN
April 08, 2004 - 9:37 am UTC
that is the way it works -- when it sees multiple 'termination' characters, it eats them all up.
My first thought is -- you have SQL, let's use it:
ops$tkyte@ORA9IR2> CREATE TABLE "TESTLOAD"
2 (
3 DATA VARCHAR2(4000)
4 )
5 ORGANIZATION external
6 (
7 TYPE oracle_loader
8 DEFAULT DIRECTORY TEST
9 ACCESS PARAMETERS
10 (
11 RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8ISO8859P1
12 BADFILE 'testload.bad'
13 LOGFILE 'test.log_xt'
14 READSIZE 1048576
15 FIELDS LDRTRIM
16 REJECT ROWS WITH ALL NULL FIELDS
17 (
18 "DATA" CHAR(4000)
19 )
20 )
21 location
22 (
23 'testload.txt'
24 )
25 )REJECT LIMIT UNLIMITED
26 /
Table created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select * from testload;
DATA
------------------------------------------------------------------------
12 SARATOGA
10 ACCOUNTING CLEVELAND
11 ART SALEM
13 FINANCE BOSTON
21 SALES PHILA.
22 SALES ROCHESTER
42 INT'L SAN FRAN
7 rows selected.
ops$tkyte@ORA9IR2> column c1 format a15
ops$tkyte@ORA9IR2> column c2 format a15
ops$tkyte@ORA9IR2> column c3 format a15
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select substr( data, tab0+1, tab1-tab0-1 ) c1,
2 substr( data, tab1+1, tab2-tab1-1 ) c2,
3 substr( data, tab2+1, tab3-tab2-1 ) c3
4 from (
5 select data,
6 0 tab0,
7 instr(data,chr(9),1,1) tab1,
8 instr(data,chr(9),1,2) tab2,
9 length(data)+1 tab3
10 from testload
11 )
12 /
C1 C2 C3
--------------- --------------- ---------------
12 SARATOGA
10 ACCOUNTING CLEVELAND
11 ART SALEM
13 FINANCE BOSTON
21 SALES PHILA.
22 SALES ROCHESTER
42 INT'L SAN FRAN
7 rows selected.
use that last query as a view of testload and you have what you want. As you can see, it is easy to generalize this to as many columns as you like.
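Packaging that last query as the suggested view might look like this (a sketch reusing the names from the example above):

```sql
-- Expose the raw single-column external table as three parsed columns.
CREATE OR REPLACE VIEW testload_v ( deptno, dname, loc )
AS
SELECT substr( data, tab0+1, tab1-tab0-1 ),
       substr( data, tab1+1, tab2-tab1-1 ),
       substr( data, tab2+1, tab3-tab2-1 )
  FROM ( SELECT data,
                0                      tab0,
                instr(data,chr(9),1,1) tab1,  -- first tab
                instr(data,chr(9),1,2) tab2,  -- second tab
                length(data)+1         tab3   -- end of record
           FROM testload );
```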
Thanks. It works like a charm!
SAJ, April 08, 2004 - 5:21 pm UTC
Not that I doubted you! You always come up with the most elegant solutions
But, I still don't understand why SQL*Loader behaves differently than the external table. Doesn't external table creation use SQL*Loader?
I created a control file for the same data file, used SQL*Loader for a test run, and it loaded the file correctly into its respective columns.
April 08, 2004 - 5:41 pm UTC
sqlldr does the same thing with that. I have that documented in my book "expert one on one Oracle".
it didn't load the way you describe -- two consecutive tabs in sqlldr would do the SAME thing (in fact, in sqlldr it rejected the row unless I used trailing nullcols)
Help Wanted!!!!!!!!!!!
A reader, June 22, 2004 - 4:41 pm UTC
Hi Tom,
I am using external tables. But I am getting the following error:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-04063: unable to open log file ext_vende_1012_1032.log
OS error The system cannot find the file specified.
ORA-06512: at "SYS.ORACLE_LOADER", line 14
ORA-06512: at line 1
The code is as follows:
drop table ext_VENDEDORES;
CREATE TABLE ext_VENDEDORES
(
VENDID NUMBER(10) ,
CODIGO VARCHAR2(10) ,
EMPRESA VARCHAR2(30) ,
NOMBRE VARCHAR2(80) ,
IDENTIFI NUMBER(10) ,
DIRECCION VARCHAR2(80) ,
TELEFONO NUMBER(10) ,
PROMOTOR NUMBER(1) ,
FECHA DATE ,
ACOPIO VARCHAR2(15) ,
ESTADO VARCHAR2(15)
)
organization external
(type oracle_loader
default directory my_dir
access parameters
(records delimited by newline
badfile my_dir:'ext_vende%a_%p.bad'
logfile my_dir:'ext_vende%a_%p.log'
fields terminated by ','
missing field values are null
(
VENDID ,
CODIGO ,
EMPRESA,
NOMBRE ,
IDENTIFI ,
DIRECCION,
TELEFONO ,
PROMOTOR ,
FECHA char date_format date mask "dd/mm/yyyy hh24:mi:ss",
ACOPIO,
ESTADO
)
)
location ('vendedores.txt')
)
reject limit unlimited;
I created the directory as my_dir on the associated path and copied the .txt data file to that directory.
The concerned user was granted read-write privileges on that directory. Also, I verified the existence of the directory from all_directories dictionary.
I also tried copying the file and changing the location from 'vendedores.txt' to 'c:/vendedores.txt'
But it generated the same error. It creates the external table but generates the error when I do a select * from ext_vendedores;
Please help me resolve this issue..
Thanks as always
June 22, 2004 - 10:17 pm UTC
the directory must exist on the SERVER.
the concerned user is the "oracle software owner" as far as the OS is concerned.
oracle must have read write access to this directory
and the directory must exist on the SERVER (database server) itself.
Disable log
Jan, June 23, 2004 - 5:37 am UTC
Every time I select from an external table, a log file "TABLENAME_SESSIONID.LOG" is generated. How can I disable this?
Thanks
June 23, 2004 - 9:20 am UTC
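One way to suppress that log is the oracle_loader NOLOGFILE clause -- a sketch of the relevant fragment of the access parameters (the surrounding field list is illustrative):

```sql
-- NOLOGFILE tells the access driver not to write
-- TABLENAME_SESSIONID.LOG on each select from the table.
ACCESS PARAMETERS
( RECORDS DELIMITED BY NEWLINE
  NOLOGFILE
  FIELDS TERMINATED BY ','
  MISSING FIELD VALUES ARE NULL
)
```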
Bad file
A reader, June 23, 2004 - 9:04 pm UTC
I have a lot of SQLLDR scripts to load data into staging tables and then upsert into main tables. The new external tables and MERGE features would be really useful here.
But with SQLLDR, which is invoked from Unix shell scripts, I usually look at the return code ($?) from 'sqlldr' and do a 'wc -l' on the bad file to determine success/failure, etc.
So, when using external tables, I would need to create a bad file to see if the load succeeded or not, do a count(*) (similar to wc -l badfile.bad), etc.
Since the bad file has exactly the same format as my data file, instead of duplicating the DDL for the bad file/table, can I do the usual trick we do to copy a table's definition like
create table my_bad_table as select * from ext_table where 1=2;
June 24, 2004 - 9:16 am UTC
well, if the original table's DDL wasn't able to process the input records -- why would copying it to a "bad table" definition work???
the bad table would just reject all of the records!
your bad table would be something so simple the records could not be rejected, like:
ops$tkyte@ORA9IR2> create table t
2 ( data varchar2(4000) )
3 organization external
4 ( type oracle_loader default directory data_dir
5 access parameters
6 ( records delimited by newline fields ( "DATA" char(4000) ) )
7 location ('test.dat' )
8 )
9 /
if your rowsize exceeds 4000 characters, you would just use multiple fields.
to find out how many rejects -- "select count(*) from t", to see the rejects -- select * from t;
A reader, June 24, 2004 - 3:41 pm UTC
No, what I meant was...
In the SQLLDR world, the bad file has exactly the same format as my input data file. I don't have to do anything for that; that's just the way SQLLDR works.
In contrast, if I want to trap bad records in the external table world, I have to specifically create the DDL for the bad table. This just seems like extra work to me!
Alternatively, instead of
access parameters
(records delimited by newline
badfile external_tables_dir:foobar
I would have liked to do
access parameters
(records delimited by newline
badTABLE foobar)
and then Oracle would automagically create the badfile and the corresponding external table to access that for me, in one swoop.
I know, I know, all these new features are making us lazy!
Performance Issues with External Tables
External Table user, June 25, 2004 - 9:45 am UTC
Hi,
I am using external tables to create a database. Data to be loaded is in the form of flat files. These files have been used as external tables, and there are PL/SQL procedures to transform them, and load them into database tables.
Recently, there have been performance issues with the databases (the UNIX server hosts many databases). The DBAs have come to the conclusion that it might be caused by the external tables. (This is a development database, and there are only around 12 external tables, with each file less than 10 KB. The external tables are referenced only for loading data. There are only a couple of joins in one of the procedures.) Do you think this conclusion is right? Can you throw some light on the issue?
Thanks
June 25, 2004 - 2:36 pm UTC
ROTFL
"The DBA's have come to the conclusion that it might be
caused due to the external tables. "
How's about THEY throw a little (just a little) light on the issue and explain how they came to a conclusion that it MIGHT be something....
Hmmm. It might be the alignment of mecury with saturn too -- or sun spots.
But, without any sort of analyis or facts to back it up, we cannot really be sure.
Data File In Application Server Can we USE EXTERNAL TABLE ??
Arindam, July 02, 2004 - 10:54 am UTC
Tom,
Thanks for answering our questions (sometimes as stupid as mine) day in and day out.
We have an application server which has an Oracle client installed, and the data resides on it. We usually use SQL*Loader to push the data from the app server to the DB server. Can we use an external table here? I know we need to create a directory first, but we don't have any links (soft or hard) between those servers. Could you comment on this?
Regards
July 02, 2004 - 11:12 am UTC
the file must be accessible to the DATABASE SERVER, as your dedicated/shared server is the process that actually reads it. It could be an NFS mount.
Error checking, lookups, etc
A reader, July 13, 2004 - 10:16 am UTC
I have many processes that get a file from a external vendor, load it into a staging table, run PL/SQL code to process each row and merge it into the main table.
When we upgrade to 9iR2, I would like to use the external tables feature for any new files we get in the future.
But I see the following issues that I would appreciate your help with.
1. The vendor sends us their primary keys (cust_no, account_no, etc). The PL/SQL looks these up into our tables and translates this into our internal ids.
2. The current process looks at the exit code from 'sqlldr' and/or the number of rows in the bad file and throws an error if either of them are not good.
3. The current process loops thru all the rows in the staging table, does all the validations and if any record fails validation, it writes that to an error table and a post-load sqlplus script queries this error table and generates a report.
All these things seem more suited towards a procedural, row-by-row approach rather than a set-based external table approach.
Can you please address the above issues and advise if it makes sense to convert these kinds of processes to use external tables?
Thanks
July 13, 2004 - 12:18 pm UTC
it absolutely makes sense.
1) scalar subqueries are awesome for automating that. you can select your lookup right in the query.
2) reject limit 0 -- the queries against the external table will get an error, same difference.
3) if you need row by row validation -- you would BULK collect 100 or so rows, and then forall i insert them after "cleansing them"
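A sketch of point 1 -- resolving the vendor's key to an internal id with a scalar subquery right in the select against the external table (the table and column names here are invented for illustration):

```sql
-- The scalar subquery does the "lookup into our tables" inline,
-- replacing the row-by-row PL/SQL translation step.
SELECT ( SELECT c.our_cust_no
           FROM customers c
          WHERE c.their_cust_no = ext.cust_no ) our_cust_no,
       ext.cust_name,
       ext.account_no
  FROM ext_table ext;
```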
Error checking, lookups, etc
A reader, July 13, 2004 - 1:10 pm UTC
Not sure if I understand you; let's take an example and work with it.
create table t
(
our_cust_no int,
their_cust_no int,
cust_name varchar2(50),
state_desc varchar2(50)
)
insert into t values (100,1,'First',null);
insert into t values (101,2,'Second',null);
My external file is defined as
1,First modified,NJ
2,Second modified,NY
3,Doesnt exist,PA
4,Bad record,ZZ
The first and second records should update my table with the modified cust_names and the expanded state descriptions (New Jersey and New York)
The 3rd record should get inserted as
(103, 3, 'Doesnt exist', 'Pennsylvania') -- the 103 here is just the max(our_cust_no) from t + 1
The 4th record should write to an error table since it has a bad state code (ZZ) or some other invalid data.
Given this setup, how can I apply your 3 suggestions?
Thanks
July 13, 2004 - 1:50 pm UTC
you'll turn a "load file into stage table, read record by record, process" into
o open a cursor on the external table -- perhaps outer joining that to the "current" table so you'll know if a record is to be updated or inserted....
o this cursor can also use a join or scalar subquery to turn NJ into New Jersey (or null if ZZ is found)
o you bulk fetch 100 or so of these.
o now you fill up three arrays
array 1 -- good records to be inserted
array 2 -- good records to be updated
array 3 -- bad records to be logged
and you forall i process them. You skip the sqlldr, you change the logic to be BULK.
Error checking, lookups, etc
A reader, July 13, 2004 - 6:44 pm UTC
This is my first time using external tables, so please bear with me, but I still haven't quite got it.
Using my earlier example, how would this go?
select col1,col2,
(select state_desc from lookup
where state_code=a.state_code)
bulk collect into ...
from external_table a,current_table b
where a.pk=b.pk(+)
Now, depending on my validations and lookups, I fill up the 3 arrays you suggest and then forall i bulk insert/update/log the records for EACH of the 3 arrays.
Questions:
1. If I set my REJECT LIMIT to 0 for the external table and the access driver rejects a row, what would happen? The entire SELECT BULK COLLECT above would fail and throw an exception, right? How do I handle this? I would want to just log that bad row into a bad table and proceed, not abort the entire bulk collect
2. From an ease-of-use perspective, how in the world is all this better?! I mean, even my existing load into staging, process row by row logic, I can just replace external_table with staging_table in my code above and still get the advantages of bulk processing, right? So, what exactly is the external table feature buying me here?
'select * from' a flat file sounds very cool, but if you want all the robust error-checking, logging, validations, lookups, etc., typically associated with load-into-staging-and-process jobs, external tables don't really add much value, do they?
Thanks
July 13, 2004 - 8:16 pm UTC
1) a fetch would fail -- yes. Your logic above said you wanted to stop:
<quote>
2. The current process looks at the exit code from 'sqlldr' and/or the number of
rows in the bad file and throws an error if either of them are not good.
</quote>
the reading of the external table is like the sqlldr run -- I assumed you wanted to stop. if not, set reject limit unlimited.
2) everything is in the database -- one process, one procedure, less work for the database. no load into stage, then read stage; just read and process the file.
Just using external tables as a "sqlldr" replacement adds a ton -- even if you did not change the process, just dropping the shell scripts buys you a lot.
heck, if you want to -- you can:
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:19481671347143
whereby you put the validation code into a pipelined function -- route bad records to a bad table, output good records after doing whatever lookups you wanted -- and use merge to make your entire etl become:
merge into existing
using ( select * from table( pipelined_function ) ) et
on ( ... )
when matched then update...
when not matched then insert ...
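As a rough sketch of what such a pipelined function could look like, tied to the running customer example (the object type, function name, and the logger are all hypothetical; logging bad rows from within a function invoked in SQL would need an autonomous transaction):

```sql
create or replace type etl_row as object ( our_cust_no number, cust_name varchar2(50) );
/
create or replace type etl_tab as table of etl_row;
/
create or replace function cleanse_file return etl_tab pipelined
as
begin
    for r in ( select ( select our_cust_no from cust
                         where their_cust_no = et.their_cust_no ) our_cust_no,
                      et.their_cust_no, et.cust_name
                 from external_table et )
    loop
        if r.our_cust_no is null then
            log_bad_record( r.their_cust_no );  -- hypothetical autonomous-transaction logger
        else
            pipe row ( etl_row( r.our_cust_no, r.cust_name ) );
        end if;
    end loop;
    return;
end;
/
```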
Correction
A reader, July 13, 2004 - 7:18 pm UTC
Sorry, I contradicted myself. If I set reject limit = 0 and the bulk collect fails because a row was rejected, I indeed want the entire thing to fail (similar to the non-zero return code $? I get from sqlldr in my current process).
But I would appreciate your comments on my code and my other observations.
Thanks
Bulk fetch
A reader, July 13, 2004 - 7:21 pm UTC
1. So I bulk fetch from my external_table cursor into one array
2. Loop thru the array and determine insert/update/bad and populate the corresponding array
3. forall i EACH array and insert/update my final table and bad table
And all this is faster than just processing the external_table cursor one row at a time?
Thanks
July 13, 2004 - 8:17 pm UTC
bulk is better, yes.
Awesome!
A reader, July 13, 2004 - 8:29 pm UTC
Sequence
A reader, July 14, 2004 - 5:12 pm UTC
Another question:
How can I make it such that all the rows loaded from the external table get the same sequence number, which is one more than the max(seq_no) in my real table?
Thanks
July 15, 2004 - 11:46 am UTC
insert into your_table
select (select max(seq_no)+1 from your_table), et.* from external_table et;
More questions
A reader, July 14, 2004 - 6:55 pm UTC
Some more questions about using external tables:
1. I don't seem to have much flexibility in naming my logfile. Is there a way to have some dynamic file name generation based on the time of day, input parameters passed into the stored proc, etc.?
2. Similar to UTL_FILE_DIR, the directory specified by CREATE DIRECTORY seems to refer to only that specific directory, not subdirectories under it. Is there a way to grant access to an entire directory tree?
3. SQLLDR has the ability to load multiple tables using various INTO TABLE and WHEN clauses. How can I do this using external tables? I guess I can use the INSERT ALL (multi-table insert) feature? How would this work in conjunction with the LOAD WHEN clause of the external table's access parameters?
I mean, suppose I have
INSERT ALL
when ... then into table ...
when ... then into table ...
select ... from external_table;
I guess it makes sense to NOT have any LOAD WHEN in the external table definition and have the WHEN clauses in the INSERT ALL control that, right?
Thanks a lot
July 15, 2004 - 11:57 am UTC
1) there are modifiers %p and %a you can include in there and utl_file can be used to rename the log file to whatever you like after you are done with it.
2) no subdirectories, you must be explicit.
3) multi-table inserts. you have a where clause -- infinitely more powerful than anything you could do in sqlldr.
i would just use "where" -- and use LOAD WHEN only to exclude records that should never be seen at all.
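For illustration, a multi-table insert driven by WHEN clauses over the external table might be sketched like this (the target tables and the deptno rule are made up):

```sql
insert first
    when deptno = 10 then
        into dept10_emps ( empno, ename ) values ( empno, ename )
    when deptno = 20 then
        into dept20_emps ( empno, ename ) values ( empno, ename )
    else
        into other_emps ( empno, ename ) values ( empno, ename )
select empno, ename, deptno
  from external_table;
```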
Dates
A reader, July 15, 2004 - 12:10 pm UTC
Thanks a lot -- another question, if you don't mind.
I have a pipe-delimited file with 100 fields, 2 of which are dates.
I did
create table ext
(
f1 varchar2(10),
f2 number,
...
f99 date,
f100 date
)
organization external
(type oracle_loader
default directory my_dir
access parameters
(
records delimited by newline
fields terminated by '|'
)
location ('test.dat')
);
Now my date field is in YYYYMMDDHH24MISS format.
Yes, I can specify all the fields again in the access parameters section and use the CHAR DATE_FORMAT DATE MASK ... for these 2 fields.
But is there a way to specify the date format just for these 2 fields and default the rest? I hate to duplicate all the other 98 field definitions!
Thanks
July 15, 2004 - 1:32 pm UTC
you can either
o map them as a varchar2 and just to_date them in the select
o just list the other column names a,b,c,d,e,f,..... in the field list and give the datatype spec only for these two.
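The first option keeps the access parameters trivial -- declare the two fields as varchar2(14) in the external table definition and convert on the way out, for example (column names taken from the example above):

```sql
-- f99 and f100 declared varchar2(14) instead of date in the create table
select f1, f2,
       to_date( f99,  'yyyymmddhh24miss' ) f99_dt,
       to_date( f100, 'yyyymmddhh24miss' ) f100_dt
  from ext;
```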
Skipping first and last records
A reader, July 15, 2004 - 12:28 pm UTC
How can I skip loading the first and last records in the file?
SKIP 1 will skip the first but how can I skip the last?
Thanks
July 15, 2004 - 5:49 pm UTC
there would have to be something that denotes the "last record" -- a where clause would filter it. alternatively, we could use rownum and a count(*) over () to skip it (less efficiently).
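The rownum/count(*) over () variant could be sketched as follows (it must read the whole file to know the count, hence "less efficiently"; SKIP 1 in the access parameters still handles the first record):

```sql
select *
  from ( select e.*, rownum rn, count(*) over () cnt
           from external_table e )
 where rn < cnt;   -- every row except the last one
```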
Rename
A reader, July 15, 2004 - 1:22 pm UTC
You said
"there are modifiers %p and %a you can include in there and utl_file can be used to rename the log file to whatever you like after you are done with it"
If the logfile has the %p modifier in it, the filename is dynamic. How can I use utl_file.frename to rename the file if I don't know the name?!
Thanks
July 15, 2004 - 6:03 pm UTC
%p is your process id, just query it up.
ops$tkyte@ORA9IR2> create table external_table
2 (EMPNO NUMBER(4) ,
3 ENAME VARCHAR2(10),
4 JOB VARCHAR2(9),
5 MGR NUMBER(4),
6 HIREDATE DATE,
7 SAL NUMBER(7, 2),
8 COMM NUMBER(7, 2),
9 DEPTNO NUMBER(2)
10 )
11 ORGANIZATION EXTERNAL
12 ( type oracle_loader
13 default directory data_dir
14 access parameters
15 (
16 RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8ISO8859P1
17 logfile 'foo_%p.log'
18 fields terminated by ','
19 )
20 location ('emp.dat')
21 )
22 /
Table created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set linesize 105
ops$tkyte@ORA9IR2> set echo on
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> column log new_val log
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select '/tmp/foo_' || a.spid || '.log' log
2 from v$process a, v$session b
3 where a.addr = b.paddr
4 and b.sid = (select sid from v$mystat where rownum=1)
5 /
LOG
-------------------------
/tmp/foo_15741.log
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> !ls -l &LOG
ls: /tmp/foo_15741.log: No such file or directory
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select empno from external_table where rownum = 1
2 /
EMPNO
----------
7369
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> !ls -l &LOG
-rw-r--r-- 1 ora9ir2 ora9ir2 1137 Jul 15 18:00 /tmp/foo_15741.log
Logging exceptions
A reader, July 15, 2004 - 4:47 pm UTC
After reviewing all your suggestions in the context of my environment, I have the following so far.
declare
type my_tab_t is table of my_table%rowtype index by binary_integer;
my_tab my_tab_t;
errors exception;
PRAGMA exception_init(errors, -24381);
begin
select
seq.nextval,
(select nvl(max(load_seq_no),0)+1 from my_table),
(select our_cust_no from cust where their_cust_no=ext.their_cust_no),
...
BULK COLLECT INTO
my_tab
from external_table ext;
-- do some more validations
begin
forall i in 1..my_tab.count SAVE EXCEPTIONS
insert into my_table values my_tab(i);
exception
when errors then
...
end;
Now, in my WHEN ERRORS section above, I need to log the 'their_cust_no' field from my vendor's file since we communicate with them on that key. But since I have translated it to our_cust_no in my scalar subquery, it is lost!
How can I get it back?
Yes, instead of defining a %ROWTYPE table, I can create a record with all the columns and add this additional column, but then I lose the ability to simply
insert into table values record; -- this is really cool
Is there a more elegant solution?
Thanks
July 15, 2004 - 6:38 pm UTC
translate it back when you log the error??
A reader, July 15, 2004 - 6:45 pm UTC
"translate it back when you log the error??"
um, maybe I didn't explain it clearly. I can't translate it back because I don't have anything to translate from, that's the point.
See, the file has their_cust_no and I translate this using a scalar subquery into our_cust_no. My target table has a NOT NULL and a FK on our_cust_no. So if the lookup fails, my bulk collect (and hence my sql%bulk_exceptions) has NULL. In this case, I want to get their_cust_no back.
How can I do this?
Thanks
July 15, 2004 - 6:52 pm UTC
umm, if you can do this:
(select our_cust_no from cust where their_cust_no=ext.their_cust_no),
you can do this:
(select their_cust_no from cust where our_cust_no=ERROR_RECORD.cust_no),
you appear to have a lookup table that goes back and forth no????
A reader, July 15, 2004 - 6:46 pm UTC
And in any case, to translate it back, I would have to query the external table again i.e. launch the loader process again. This would be expensive.
I was hoping there would be a PL/SQL way to do this. Thanks
July 15, 2004 - 6:52 pm UTC
see above.
A reader, July 15, 2004 - 8:39 pm UTC
<quote>
umm, if you can do this:
(select our_cust_no from cust where their_cust_no=ext.their_cust_no),
you can do this:
(select their_cust_no from cust where our_cust_no=ERROR_RECORD.cust_no),
you appear to have a lookup table that goes back and forth no????
</quote>
yes, but as I explained, (select our_cust_no from cust where their_cust_no=ext.their_cust_no) returns no rows (null in the scalar subquery) because their_cust_no is a BAD number, NOT in my lookup table.
So, error_record.cust_no is NULL -- I can't use it to translate back, hence my comment about ext.their_cust_no being "lost"
July 16, 2004 - 10:41 am UTC
well, I guess you would have to fetch it out in the "bulk collect" -- and since you are thinking of moving into 3 arrays anyhow, just don't move that field into the 3 arrays. you need to split them out anyway, right?
How does the access driver work?
A reader, July 15, 2004 - 9:29 pm UTC
These might be stupid questions, but please indulge me...
1. If I 'select * from ext_table where rownum<=10', does it just load the first 10 records in the file? (similar to sqlldr's load=10 option)
2. Or is the entire file loaded every time the ext_table is referenced in any manner in any DML?
3. To change the file location, I can simply do
alter table ext_table location ('new_filename');
But to change from LOGFILE to NOLOGFILE, I need to specify the entire 'access parameters' section! This is a pain.
From a design perspective, do you know why the logfile clause is treated any differently than the 'location' clause?
Thanks
July 16, 2004 - 10:42 am UTC
1) yes.
2) it stops reading once the query is satisfied -- the entire file is not necessarily loaded.
3) it seems you would either
a) log it
b) not log it?
the technical reason, I suppose, is that logfile is a small part of a larger clause -- but it seems you would always want to either log or not.
Reply to Reader with translation issue
Bill, July 16, 2004 - 8:29 am UTC
<quote>
umm, if you can do this:
(select our_cust_no from cust where their_cust_no=ext.their_cust_no),
you can do this:
(select their_cust_no from cust where our_cust_no=ERROR_RECORD.cust_no),
you appear to have a lookup table that goes back and forth no????
</quote>
yes, but as I explained, (select our_cust_no from cust where
their_cust_no=ext.their_cust_no), returns no rows (null in the scalar subquery)
because their_cust_no is a BAD number, NOT in my lookup table.
So, error_record.cust_no is NULL, I can't use it to translate back, hence my
comment about ext.cust_no being "lost"
===================================================
Seems obvious to me, but:
why not put a holding variable in your code, and store their_cust_no in there until you are finished error processing? This way, you will not lose it - just set the holding variable every time you loop through a new record.
July 16, 2004 - 11:21 am UTC
won't work -- we are using bulk operations -- not slow by slow processing.
Reply to Reader with translation issue - another thought
Bill, July 16, 2004 - 8:47 am UTC
Just realized that if you do:
(select our_cust_no from cust where their_cust_no=ext.their_cust_no)
and it returns no rows, why can't you just set
error_record.cust_no = their_cust_no
at this point? You already "know" what their_cust_no is - just add a check to see if error_record.cust_no is null and if it is, redirect it to their_cust_no. Make sense?
To Bill
A reader, July 16, 2004 - 9:29 am UTC
We are not on the same page, let me explain...
I have a target table I am loading from an external table. I am BULK COLLECTing into a table type based on %ROWTYPE. Look at my code above. This lets me do
select
<all the columns in my target table>
bulk collect into myrectab
from external_table;
forall i in 1..myrectab.count save exceptions
insert into target_table values myrectab(i);
Now, at this point, myrectab already contains all the data I need to insert. If a few of the rows have a bad their_cust_no, the scalar subquery in my select above would have stored NULL in that record in myrectab. So when I FORALL INSERT into my table, I would get a 'not null constraint violated' error. At this point, I need to log the their_cust_no *for that record* into an error table. But since I translated it to NULL, the their_cust_no is lost!
"why not put a holding variable in your code, and store your their_cust_no in there until you are finished error processing? This way, you will not lose it -
just set the holding variable every time you loop through a new record"
Right, all I need to do is select both the their_cust_no and the translated our_cust_no, but then I can't bulk collect into a %ROWTYPE collection, i.e. it would be like
select
<columns needed for my target_table>,
ext.their_cust_no
bulk collect into XXX
from external_table ext;
But the XXX above has to be a %ROWTYPE collection based on my target_table, otherwise I can't use it in my FORALL to insert the row.
To Reader - OOPS! I missed that, sorry!
Bill, July 16, 2004 - 10:10 am UTC
Did not catch the "has to be a %ROWTYPE collection" part - sorry about that! I must learn to READ!
How does the access driver work?
A reader, July 16, 2004 - 11:03 am UTC
1. OK so if I have a rownum<=10 against the external table, it will load only the first 10 records.
But if I access the external table in a complicated query using a few other tables and joins and WHERE clauses and stuff, it *has* to load the entire table into memory or temp to satisfy the query, right?
In other words, from an EXPLAIN PLAN perspective, what are the various access paths that Oracle has for an external table? The only one I see after I analyze the external table is EXTERNAL TABLE ACCESS (FULL). Are there others?
2. Regarding the LOGFILE clause, I was thinking that I would set it to NOLOGFILE in the table DDL assuming that most times the file is fine, no errors, so there is no benefit in having a logfile generated and clutter up my directory. But when there is a problem, I can just ALTER TABLE LOGFILE and re-run the code to have a logfile generated. Sort of like turning on tracing to diagnose a problem.
Thanks
July 16, 2004 - 1:52 pm UTC
1) no -- it all depends. it might, it might not.
table access full is pretty much it for external tables -- it does not mean, however, that it has to read the entire thing up front for any query. it just means we'll be reading it row by row at some point -- but we can always stop.
2) then you'll be replacing the entire access parameters clause - set it up in a stored procedure to make it easier (that's what sp's are really good at)
Bulk collect into multiple collections
A reader, July 16, 2004 - 11:16 am UTC
Here is what I want to do:
type rec_t is record
(
emp_rec emp%rowtype,
extra_column varchar2(100)
);
type mytype is table of rec_t index by binary_integer;
mytab mytype;
select
a.*,
'zzz'
bulk collect into mytab
from emp a;
forall i in 1..mytab.count
insert into emp_copy values mytab(i).emp_rec;
This gives me an error about not being able to mix single-row and multi-row in a bulk collect.
How can I split this into 2 collections and still use the rowtype and bulk collect? (I want to use the rowtype so that I can insert the entire record instead of specifying each field)
Thanks
July 16, 2004 - 1:55 pm UTC
you cannot -- as i said, you'd have to move it into another array -- which seems to be what you would want to do anyhow:
a) array 1 to be updated
b) array 2 to be inserted
c) array 3 to be error logged.
Splitting into 3 arrays
A reader, July 16, 2004 - 2:45 pm UTC
Um, sorry, I was trying to figure this out myself but wasn't quite able to modify my existing code (posted earlier) to split into the 3 arrays you suggest (and still maintain my %rowtype construct).
Could you please get me started on how to split into these 3 arrays?
Thanks
July 16, 2004 - 10:52 pm UTC
you would have 4 arrays - one you fetch into, one you insert with, one you update with and one you log errors with.
they can all be different.
write three stored procedures to assign a fetched row into the other types and your code will be "nice looking"
How?
A reader, July 17, 2004 - 12:29 am UTC
Um, thanks, but can you post some sample code to illustrate what you suggest?
"you would have 4 arrays - one you fetch into, one you insert with, one you update with and one you log errors with"
so, I get all my rows bulk collected into one array. Now what? How can I split them into the other 3 arrays *and maintain bulk processing*?
"write three stored procedures to assign a fetched row into the other types and your code will be "nice looking""
Not sure what you mean by nice looking?
July 17, 2004 - 2:34 pm UTC
you are procedurally processing data here - it is called "assignment", you assign the elements of the array you fetched into -- into one of three arrays. then, after you fill them up -- you execute three bulk statements.
by nice looking, I mean readable and terse:
if ( inserting ) then
insert_array(insert_array.count+1) :=
make_me_an_insert_record( bulk_fetched_record(i) );
elsif ( updating ) then
update_array(update_array.count+1) :=
make_me_an_update_record( bulk_fetched_record(i) );
else
bad_array(bad_......)
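Filling in that pseudocode, the assign-then-bulk-process shape might look like this (the array names, helper functions, and the insert/update test are all placeholders for whatever your fetch query produces):

```sql
for i in 1 .. l_fetched.count
loop
    if l_fetched(i).our_cust_no is null then
        l_bad( l_bad.count + 1 ) := make_bad_rec( l_fetched(i) );
    elsif l_fetched(i).existing_row = 'Y' then
        l_upd( l_upd.count + 1 ) := make_update_rec( l_fetched(i) );
    else
        l_ins( l_ins.count + 1 ) := make_insert_rec( l_fetched(i) );
    end if;
end loop;

forall i in 1 .. l_ins.count
    insert into target_table values l_ins(i);
forall i in 1 .. l_upd.count
    update target_table
       set cust_name = l_upd(i).cust_name
     where our_cust_no = l_upd(i).our_cust_no;
forall i in 1 .. l_bad.count
    insert into bad_table values l_bad(i);
```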
A reader, July 30, 2004 - 10:38 am UTC
<quote>
2) then you'll be using the entire access parameter - set it up in a stored procedure to make it easier (that's what sp's are really good at)
</quote>
Hm, I just realized the ACCESS_PARAMETERS is a CLOB in the dictionary. Can you help me with writing a SP to change the LOGFILE parameter and keep the rest unchanged? The spec for the sp would be
my_pkg.alter_external_table(table_name,file_type,dir,name)
file_type would be log, bad, or disc or NONE (to put the NOLOGFILE)
I haven't worked with DBMS_LOB much, so any help would be much appreciated.
Since it is a free-form text, the sp would need to parse out
LOGFILE <bunch of whitespace including newline> ....
and stuff.
Thanks for your help
July 30, 2004 - 5:28 pm UTC
not a chance.
you do not touch the data dictionary -- ever.
A reader, July 30, 2004 - 7:56 pm UTC
No, no, you misunderstand. I was just trying to implement your suggestion to replace the entire access parameters clause.
i.e. read in the user_external_tables.access_parameters clob, parse out the logfile clause, write the new logfile clause and write the entire access parameters clause back using the regular ALTER TABLE ... ACCESS PARAMETERS clause.
All this just because the logfile clause is not available via a simple ALTER TABLE like the LOCATION clause is.
Help?
Thanks
July 31, 2004 - 10:45 am UTC
Ok, just read the lob out into a plsql variable of type "long". Unless you make it bigger than 32k (doubtful), it'll just be a "string"
is
l_variable long;
...
begin
....
l_variable := dbms_lob.substr( l_clob_value, 32765, 1 );
....
then use substr, instr, etc to "parse" it and do your replacement.
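A simplified sketch of such a procedure (the name and the parsing are hypothetical -- a robust version would need to handle case, comments, and quoting far more carefully, and assumes a `logfile 'name'` clause is already present):

```sql
create or replace procedure set_logfile( p_table in varchar2, p_logfile in varchar2 )
as
    l_params long;
    l_start  number;
    l_end    number;
begin
    select dbms_lob.substr( access_parameters, 32765, 1 )
      into l_params
      from user_external_tables
     where table_name = upper( p_table );

    -- naive parse: find the logfile keyword and its quoted file name
    l_start := instr( lower( l_params ), 'logfile' );
    l_end   := instr( l_params, '''',
                      instr( l_params, '''', l_start ) + 1 );

    l_params := substr( l_params, 1, l_start - 1 ) ||
                'logfile ''' || p_logfile || '''' ||
                substr( l_params, l_end + 1 );

    execute immediate 'alter table ' || p_table ||
                      ' access parameters ( ' || l_params || ' )';
end;
/
```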
how to load this data
Pukala, August 03, 2004 - 5:24 am UTC
Hi
I have this data
200308;1
200405;2
I then create this table
create table go$external
(a date,
b number)
organization external
(type oracle_loader
default directory prueba
access parameters
( records delimited by newline
badfile 'go.bad'
logfile 'go.log'
fields terminated by ';'
(a char(6) date_format date mask "yyyymm",
b)
)
location ('go.txt')
)
reject limit unlimited;
the problem is, since the second field hasn't got a trailing ";", I get this error
error processing column B in row 1 for datafile /tmp/go.txt
ORA-01722: invalid number
how can I overcome this?
August 03, 2004 - 8:49 am UTC
you'll have to show me how to reproduce step by step -- for the example you gave works "out of the box"
ops$tkyte@ORA9IR2> create or replace directory prueba as '/tmp'
2 /
Directory created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> drop table go$external;
Table dropped.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create table go$external
2 (a date,
3 b number)
4 organization external
5 (type oracle_loader
6 default directory prueba
7 access parameters
8 ( records delimited by newline
9 badfile 'go.bad'
10 logfile 'go.log'
11 fields terminated by ';'
12 (a char(6) date_format date mask "yyyymm",
13 b)
14 )
15 location ('go.txt')
16 )
17 reject limit unlimited;
Table created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> !cat /tmp/go.txt
200308;1
200405;2
ops$tkyte@ORA9IR2> select * from go$external;
A B
--------- ----------
01-AUG-03 1
01-MAY-04 2
overwrite the logfile
A reader, August 03, 2004 - 5:33 am UTC
Every time we query the external table the logfile is appended to. Is there a way to overwrite it? Can't find anything in the docs!
Cheers
August 03, 2004 - 8:52 am UTC
<quote>
The LOGFILE clause names the file that contains messages generated by the external tables utility while it was accessing data in the datafile. If a log file already exists by the same name, the access driver reopens that log file and appends new log information to the end. This is different from bad files and discard files, which overwrite any existing file. NOLOGFILE is used to prevent creation of a log file.
</quote>
that is the way it works by design, you'd have to use utl_file to "erase it" first if this was not desired.
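For example, to "erase" it first, reusing the directory and logfile names from the question's setup (utl_file.fremove takes a directory object name in 9i and up):

```sql
begin
    utl_file.fremove( 'PRUEBA', 'go.log' );  -- directory object name, file name
exception
    when others then
        null;  -- ignore the error if the file does not exist yet
end;
/
```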
Accessing logfile from client
va, August 06, 2004 - 11:25 pm UTC
The loader process creates the logfile/badfile on the server (Unix). How can developers read these from the client side (Windows)? They don't have Unix user accounts.
Help? Thanks
August 07, 2004 - 10:08 am UTC
create YAET
(yet another external table)
just like external_clntques_bad demonstrated above.
CLOB
A reader, August 07, 2004 - 11:23 am UTC
Instead of doing
ops$ora920@ORA920.US.ORACLE.COM> create table external_clntques_bad
2 ( text1 varchar2(4000) ,
3 text2 varchar2(4000) ,
4 text3 varchar2(4000)
5 )
6 organization external
...
Can I just load the file into a CLOB column and be done with it instead of these text1, text2, text3 columns?
Thanks
August 07, 2004 - 1:47 pm UTC
ops$tkyte@ORA9IR2> create table external_clntques_bad
2 ( txt clob
3 )
4 organization external
5 (type oracle_loader
6 default directory external_tables_dir
7 access parameters
8 (
9 records delimited by newline
10 fields
11 missing field values are null
12 ( txt position(1:1000000)
13 )
14 )
15 location ('foobar.bad')
16 )
17 /
( txt clob
*
ERROR at line 2:
ORA-30656: column type not supported on external organized table
Concurrent access
A reader, August 11, 2004 - 6:56 am UTC
1. How does Oracle handle concurrent access to external tables? Suppose 2 sessions do a 'select' from the external table -- they don't interfere with each other in any way, right?
2. Suppose one session is in the middle of a select * from a million row file, other session comes along and does a ALTER TABLE LOCATION ('newfile.dat'). Would the first session get confused or would the second session block?
Thanks
August 11, 2004 - 10:13 am UTC
1) they are read only -- other than "io will be contended for" they do not "bother" each other
2) once a plan is in place, a plan is in place and nothing is going to alter the course of the guy executing that plan. no confusion.
Concurrent access
A reader, August 11, 2004 - 11:17 am UTC
"once a plan is in place, a plan is in place and nothing is going to alter the course of the guy executing that plan. no confusion"
Right, I don't doubt that the first session will continue to keep fetching the 1 million rows.
But when this is going on, would the second sessions
ALTER TABLE EXT_TABLE LOCATION('new.dat')
succeed?
What if the first session has paused the fetching using 'sqlplus' pause or something? And resumes fetching after the LOCATION has changed? Would the first session error out because it is now pointing to a different file?
August 11, 2004 - 1:33 pm UTC
the first session has a plan -- plan is "read this file foo.dat". that plan is not affected by an alter -- the plan is fixed in place. no matter what happens -- plan will stay "read this file foo.dat" until the query completes.
ops$tkyte@ORA9IR2> create or replace directory data_dir as '/tmp/'
2 /
Directory created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create table external_table
2 (EMPNO NUMBER(4) ,
3 ENAME VARCHAR2(10),
4 JOB VARCHAR2(9),
5 MGR NUMBER(4),
6 HIREDATE DATE,
7 SAL NUMBER(7, 2),
8 COMM NUMBER(7, 2),
9 DEPTNO NUMBER(2)
10 )
11 ORGANIZATION EXTERNAL
12 ( type oracle_loader
13 default directory data_dir
14 access parameters
15 ( fields terminated by ',' )
16 location ('emp.dat')
17 )
18 /
Table created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> host flat scott/tiger emp > /tmp/emp.dat
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> variable x refcursor
ops$tkyte@ORA9IR2> variable y refcursor
ops$tkyte@ORA9IR2> exec open :x for select empno from external_table;
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> alter table external_table location( 'new.dat' );
Table altered.
ops$tkyte@ORA9IR2> exec open :y for select empno from external_table;
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> print x
EMPNO
----------
7369
7499
7521
7566
7654
7698
7782
7788
7839
7844
7876
7900
7902
7934
14 rows selected.
ops$tkyte@ORA9IR2> print y
ERROR:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-04040: file new.dat in DATA_DIR not found
ORA-06512: at "SYS.ORACLE_LOADER", line 14
ORA-06512: at line 1
no rows selected
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> drop table external_table;
Table dropped.
Log file directory - Oracle 9.2.0.4
Raja, August 12, 2004 - 3:01 pm UTC
Hi Tom,
Just wanted to know if a separate logfile directory could be specified in the external table DDL, and how. Else, is there any other way to get around this.
Thanks,
Raja
August 12, 2004 - 3:28 pm UTC
organization external
(type oracle_loader
default directory my_dir
access parameters
(records delimited by newline
badfile my_bad_dir:'ext_vende%a_%p.bad'
logfile my_log_dir:'ext_vende%a_%p.log'
fields terminated by ','
missing field values are null
(
you can specify the directory, yes.
Log file directory - Oracle 9.2.0.4
Raja, August 12, 2004 - 4:05 pm UTC
Thanks Tom it works!!
array processing external files
A reader, September 19, 2004 - 3:00 pm UTC
Hi
We have databases in Oracle 8i and 9i. We process text files daily with Pro*C. With 9i this is easy -- we make use of external tables. However, in 8i we process the text files line by line and it's not as fast.
My question is: is it possible to process arrays of lines with Pro*C and text files?
September 19, 2004 - 3:15 pm UTC
sure, C is C is C...
in C you can fread as much data as you want -- you have infinite control.
Not sure, since Oracle would not even be involved in the reading of an OS file via Pro*C directly -- it is all about C at that point, what else to say...
fread vs fgets
A reader, September 19, 2004 - 4:12 pm UTC
Hi
I am afraid we aren't using fread since the text file has variable line lengths, so we must use fgets. It's hard to modify the text format since these files come from other companies.
I guess this is not related to Oracle, as you say. However, do you know if it's possible to read arrays of lines using fgets?
Thx a lot
September 19, 2004 - 5:09 pm UTC
fread plus strchr can do wonders.........
parsing a line in C is simple.
just read a big chunk, and strchr looking for newlines
Spaces not getting loaded
DAP, September 23, 2004 - 6:21 pm UTC
I'm trying to load a space from an external file into a table, but the space is getting treated as a null. Is there an access parameter to make this work?
DATAFILE:
--------
abcde fghi
TABLE DEFINITION:
----------------
SQL> desc a
Name Null? Type
----------------------------------------- -------- ----------------------------
A NOT NULL VARCHAR2(5)
B NOT NULL VARCHAR2(1)
C NOT NULL VARCHAR2(4)
EXTERNAL FILE DEFINITION:
------------------------
drop table a_load;
create table a_load
(a varchar2(5),
b varchar2(1),
c varchar2(4))
ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_dir
ACCESS PARAMETERS (RECORDS delimited by newline
FIELDS
(a char(5),
b char(1),
c char(4)))
LOCATION ('a')) nologging;
insert into a
(a,
b,
c)
(SELECT
a,
b,
c
FROM a_LOAD);
COMMIT;
insert into a
*
ERROR at line 1:
ORA-01400: cannot insert NULL into ("CPUT"."A"."B")
September 24, 2004 - 10:04 am UTC
ops$tkyte@ORA9IR2> create table a_load
2 (a varchar2(5),
3 b varchar2(1),
4 c varchar2(4))
5 ORGANIZATION EXTERNAL
6 (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_dir
7 ACCESS PARAMETERS (RECORDS delimited by newline
8 FIELDS NOTRIM
9 (
10 a char(5),
11 b char(1),
12 c char(4)))
13 LOCATION ('a')) nologging;
Table created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select '"'||a||'"', '"'||b||'"', '"'||c||'"' from a_load;
'"'||A| '"' '"'||C
------- --- ------
"abcde" " " "fghi"
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into a
2 (a,
3 b,
4 c)
5 (SELECT
6 a,
7 b,
8 c
9 FROM a_LOAD);
1 row created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> COMMIT;
Commit complete.
9.2.0.5.0 prevents external tables reading from link (hard/symbolic)
Jim Dickson, September 24, 2004 - 6:22 am UTC
Can you reproduce this problem?
Under Standard Edition 9.2.0.3.0 on Solaris 2.8, an external table could happily read a file that was a symbolic link.
This no longer appears to be possible in 9.2.0.5.0.
I cannot find any documentation to support the theory that this was a security hole Oracle has plugged.
We use external tables and symbolic links extensively.
TEST CASE:
( You will have to substitute the directory name/path )
Since upgrading to 9.2.0.5.0, it fails.
TEST CASE:
hh-l:/home/jdickson/External_Tables.92050->more external_tables_link.sql
spool external_tables_link.out
set echo on
show user
!rm /scv/dev/external_tables/file1.psv
!rm /scv/dev/external_tables/file2.psv
!echo "file1" > /scv/dev/external_tables/file1.psv
!echo "file2" > /scv/dev/external_tables/file2.psv
!cat /scv/dev/external_tables/file1.psv
!cat /scv/dev/external_tables/file2.psv
drop table ext_table;
create table ext_table
(
col1 varchar2(15)
)
ORGANIZATION EXTERNAL
( type oracle_loader
default directory external_tables
access parameters
(
RECORDS DELIMITED BY NEWLINE
fields terminated by ',' NOTRIM
MISSING FIELD VALUES ARE NULL
)
location ('file0.psv')
)
reject limit 0
;
alter table ext_table location ('file1.psv');
select * from ext_table;
alter table ext_table location ('file2.psv');
select * from ext_table;
!rm /scv/dev/external_tables/file2.psv
!ln /scv/dev/external_tables/file1.psv /scv/dev/external_tables/file2.psv
!cat /scv/dev/external_tables/file1.psv
!cat /scv/dev/external_tables/file2.psv
select * from ext_table;
spool off
exit;
RESULTS:
jdickson@DV01> show user
USER is "JDICKSON"
jdickson@DV01>
jdickson@DV01> !rm /scv/dev/external_tables/file1.psv
jdickson@DV01> !rm /scv/dev/external_tables/file2.psv
jdickson@DV01>
jdickson@DV01> !echo "file1" > /scv/dev/external_tables/file1.psv
jdickson@DV01> !echo "file2" > /scv/dev/external_tables/file2.psv
jdickson@DV01> !cat /scv/dev/external_tables/file1.psv
jdickson@DV01> !cat /scv/dev/external_tables/file2.psv
jdickson@DV01>
jdickson@DV01> drop table ext_table;
Table dropped.
jdickson@DV01> create table ext_table
2 (
3 col1 varchar2(15)
4 )
5 ORGANIZATION EXTERNAL
6 ( type oracle_loader
7 default directory external_tables
8 access parameters
9 (
10 RECORDS DELIMITED BY NEWLINE
11 fields terminated by ',' NOTRIM
12 MISSING FIELD VALUES ARE NULL
13 )
14 location ('file0.psv')
15 )
16 reject limit 0
17 ;
Table created.
jdickson@DV01>
jdickson@DV01> alter table ext_table location ('file1.psv');
Table altered.
jdickson@DV01> select * from ext_table;
COL1
---------------
file1
jdickson@DV01>
jdickson@DV01> alter table ext_table location ('file2.psv');
Table altered.
jdickson@DV01> select * from ext_table;
COL1
---------------
file2
jdickson@DV01>
jdickson@DV01> !rm /scv/dev/external_tables/file2.psv
jdickson@DV01> !ln /scv/dev/external_tables/file1.psv /scv/dev/external_tables/file2.psv
jdickson@DV01> !cat /scv/dev/external_tables/file1.psv
jdickson@DV01> !cat /scv/dev/external_tables/file2.psv
jdickson@DV01>
jdickson@DV01>
jdickson@DV01> select * from ext_table;
select * from ext_table
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-29400: data cartridge error
KUP-04001: error opening file /scv/dev/external_tables/file2.psv
ORA-06512: at "SYS.ORACLE_LOADER", line 14
ORA-06512: at line 1
jdickson@DV01>
jdickson@DV01> spool off
PLUS, the log file shows:
hh-l:/home/jdickson/External_Tables.92050->tail /scv/dev/external_tables/EXT_TABLE_19785.log
Record format DELIMITED BY NEWLINE
Data in file has same endianness as the platform
Rows with all null fields are accepted
Fields in Data Source:
COL1 CHAR (255)
Terminated by ","
KUP-04001: error opening file /scv/dev/external_tables/file2.psv
KUP-04017: OS message: Permission denied
September 24, 2004 - 11:14 am UTC
confirmed in 10g and 9205 -- not sure if this was the expected outcome or not. You'll want to contact support and ask them about:
BugTag: Support notes on <Bug:2290323> - DDR info <BugDesc:2290323>
Affects: RDBMS (90-A0)
NB: FIXED
Abstract: EXTERNAL TABLE parameters should disallow symbolic links
Fixed-Releases: 9202 A000
Tags: EXTTAB
Details:
EXTERNAL TABLE parameters should disallow symbolic links
(for security reasons).
Note: This fix is modified slightly in 9205
the fix was for files written by external tables in 9202 (log, bad, etc.) -- the 'modification' in 9205 seems to affect reads as well, but I didn't see any more information than that. They'll have to research it a little to see whether this is unexpected behaviour or the behaviour going forward.
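Until support clarifies the intended behaviour, one workaround is to detect links before pointing the LOCATION at a file and copy the data instead. A hedged sketch in Python (the check mimics what the driver appears to reject; `os.path.islink` catches symbolic links, and `st_nlink > 1` catches hard links like the `ln` in the test case):

```python
# Sketch: detect whether a data file is a symbolic or hard link before
# using it as an external table LOCATION, since 9.2.0.5 refuses links.
import os

def is_link(path):
    if os.path.islink(path):           # symbolic link
        return True
    return os.stat(path).st_nlink > 1  # hard link: >1 directory entry

# If is_link() returns True, copying the file (e.g. with shutil.copy)
# in place of the ln gives the external table a plain file to read.
```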
External tables are awesome
Scott, October 10, 2004 - 11:11 pm UTC
G'day Tom.
I created my first external table today, and it worked the first time (well, almost -- after I defined one of my columns to the correct width...)
Great feature, and thanks for all these examples.
Skip Column - Filler in External Tables
Jan, October 21, 2004 - 5:55 am UTC
How do you skip a column that has a variable size but is delimited by, e.g., ',' (without using PL/SQL or similar tricks)?
Thanks
October 21, 2004 - 7:02 am UTC
in Loader yes, but in External Tables?
Jan, October 21, 2004 - 8:16 am UTC
FILLER seems to work for SQL*Loader only, or I have a syntax error somewhere...
Please correct me where I made a mistake.
I have a text file with 3 columns, but in my external table I want just the 2nd and 3rd columns.
Thanks.
CREATE TABLE T_EXT
(
col2 VARCHAR2(10),
col3 VARCHAR2(10)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY MY_DIR
ACCESS PARAMETERS
( FIELDS TERMINATED BY '|' RTRIM
MISSING FIELD VALUES ARE NULL (col1 FILLER CHAR, col2, col3)
)
LOCATION (MY_DIR:'text.txt')
)
October 21, 2004 - 2:24 pm UTC
oh, didn't see EXTERNAL table in the 'subject', but isn't the answer sort of obvious?
instead of select * from et;
select a, c from et;
to skip b? seems "intuitive"?
but, if you have a control file:
LOAD DATA
INFILE *
INTO TABLE t
replace
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(
a ,
b filler,
c
)
all you need to do is run sqlldr ..... external_table=generate_only to see what you would need to do.
Is SQL*Loader My Only Choice for Normalization?
Doug Wingate, October 26, 2004 - 7:48 pm UTC
Tom,
I've been asked to extract records from a legacy application implemented in VSAM and written in COBOL and to load the information into an Oracle schema (as the first step to construction of a new data warehouse). One of the dubious features of COBOL is that it encourages storage of unnormalized information by allowing one to implement a one-to-many relationship as a record description containing a mixture of nonrepeating and repeating fields or groups of fields. My plan for the numerous extractions and loads I have to do was to extract whole, intact records, FTP them to the Oracle data warehouse server, and use external tables as the means to read the raw extract files. However, although I think I know how to load the files into multiple tables in a normalized way using SQL*Loader, I haven't been able to find syntax for the access parameters of external tables that would allow me to create normalized external tables on top of the unnormalized external files.
In case that description isn't clear, here is a high-level description of a record type of the kind I want to be able to read by means of multiple external tables. Each record contains
(1) several fields forming a composite key to the record,
(2) several nonrepeating fields,
(3) a group of fields (a, b, c) that repeat a fixed number of times as (a1, b1, c1, a2, b2, c2, a3, b3, c3), and
(4) several additional nonrepeating fields.
Each record type and all of its parts are physically fixed length. Variation in the number of logical occurrences in (3) is realized physically as zeroed or blanked fields in the trailing physical occurrences.
As I said, I've been asked to normalize the information when I load it. As the first step, I would like to be able to read each extract file via multiple external tables that give each unnormalized external file the appearance of being multiple normalized tables. In the case of the example I'm using, I would like to create external tables that display the information like this:
External Table #1
(1) composite key (2) non-repeaters (4) non-repeaters
External Table #2
(1) composite key (1b) 1 (3) a1, b1, c1
(1) composite key (1b) 2 (3) a2, b2, c2
(1) composite key (1b) 3 (3) a3, b3, c3
"Column" 1b is intended to preserve whatever information may be implicit in the physical order of the occurrences.
Describing external table #1 doesn't appear to be a problem. However, I haven't been able to find a combination of syntactic elements that will allow me to describe the relationship between an external file containing unnormalized information and an external table like the one described in #2. Do you know of external table syntax that I've overlooked and that would allow me to describe a table like #2?
Thanks as always.
Doug Wingate
October 26, 2004 - 8:37 pm UTC
can you give me an example of an input record or two or three (single char(1) fields to keep it small) and an example of what tables you want to put it into.
this sounds just like a multi-table insert.
Afterthought to Preceding Problem Description
Doug Wingate, October 26, 2004 - 7:53 pm UTC
I should have included mention of the fact that the ETL project I'm working on is being implemented in Oracle Database 10g. Thanks.
Mock-Up Example You Asked For
Doug Wingate, October 27, 2004 - 2:05 pm UTC
Here are tables intended to be modeled on the "example" I used above.
CREATE TABLE Table_for_Nonrepeating_Fields
(Key_Col_1 NUMBER(2,0),
Nonrepeating_Col_2A CHAR(2),
Nonrepeating_Col_2B CHAR(1),
Nonrepeating_Col_4A NUMBER(3,0),
Nonrepeating_Col_4B CHAR(1));
CREATE TABLE Table_for_Repeating_Groups
(Key_Col_1 NUMBER (2,0),
Occurrence_Key_Col_1B NUMBER(1,0),
Repeating_Col_3A NUMBER(2,0),
Repeating_Col_3B CHAR(1),
Repeating_Col_3C CHAR(1));
The corresponding input format looks like this:
Key_Col_1
Nonrepeating_Col_2A
Nonrepeating_Col_2B
Repeating_Col_3A
Repeating_Col_3B
Repeating_Col_3C
Repeating_Col_3A
Repeating_Col_3B
Repeating_Col_3C
Repeating_Col_3A
Repeating_Col_3B
Repeating_Col_3C
Nonrepeating_Col_4A
Nonrepeating_Col_4B
Here are a couple of sample records. For discussion, I've introduced whitespace and made all of the numbers "external" instead of packed. However, my actual records have fixed-length fields without whitespace and do contain IBM-type packed decimal numbers. Please note that the iteration is physically represented in the record, and I intend to represent the physical order logically with the column Occurrence_Key_Col_1B NUMBER(1,0).
12 AB A 12 A A 34 B B 56 C C 123 A
34 CD C 78 C D 90 E F 12 G H 456 C
I do understand that I could load this with a multi-table load using SQL*Loader. I could also define an unnormalized external table in which the repeating columns were represented with distinct names; then I could insert from the external table into multiple tables using a multi-table INSERT. I'm wondering if there's a way to define external tables like the two (ordinary) tables above so that the record contents appear in normal form without having to be written to new tables as a first step. In this respect, the syntax of external table access parameters doesn't seem to be as powerful as the syntax of SQL*Loader control files, but I'm hoping you know a trick or two.
I hope this is all somewhat clearer with the mock-up example.
Thanks,
Doug Wingate
October 27, 2004 - 4:17 pm UTC
set up the external table so that you can:
ops$tkyte@ORA9IR2> select * from et;
KEY NON NON REP REP REP REP REP REP REP REP REP NON NON
--- --- --- --- --- --- --- --- --- --- --- --- --- ---
12 AB A 12 A A 34 B B 56 C C 123 A
34 CD C 78 C D 90 E F 12 G H 456 C
ops$tkyte@ORA9IR2> insert all
2 into Table_for_Nonrepeating_Fields values
3 ( key_col_1, Nonrepeating_Col_2A, Nonrepeating_Col_2B, Nonrepeating_Col_4A, Nonrepeating_Col_4B )
4 into Table_for_Repeating_Groups values
5 ( key_col_1, 1, Repeating_Col_3A1, Repeating_Col_3B1, Repeating_Col_3C1 )
6 into Table_for_Repeating_Groups values
7 ( key_col_1, 2, Repeating_Col_3A2, Repeating_Col_3B2, Repeating_Col_3C2 )
8 into Table_for_Repeating_Groups values
9 ( key_col_1, 3, Repeating_Col_3A3, Repeating_Col_3B3, Repeating_Col_3C3 )
10 select * from et;
8 rows created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select * from Table_for_Nonrepeating_Fields;
KEY_COL_1 NO N NONREPEATING_COL_4A N
---------- -- - ------------------- -
12 AB A 123 A
34 CD C 456 C
ops$tkyte@ORA9IR2> select * from Table_for_Repeating_Groups;
KEY_COL_1 OCCURRENCE_KEY_COL_1B REPEATING_COL_3A R R
---------- --------------------- ---------------- - -
12 1 12 A A
34 1 78 C D
12 2 34 B B
34 2 90 E F
12 3 56 C C
34 3 12 G H
6 rows selected.
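The INSERT ALL above fans each wide external-table row into one parent row plus three child rows, numbering the occurrences as it goes. The same unpivot, sketched in Python for clarity (field positions taken from the mock-up record; this is a logic illustration, not Oracle code):

```python
# Sketch of the unpivot the INSERT ALL performs: one wide record becomes
# a nonrepeating (parent) row plus one row per repeating-group occurrence.
def normalize(record):
    key, c2a, c2b = record[0], record[1], record[2]
    groups = [record[3:6], record[6:9], record[9:12]]  # (3A,3B,3C) x 3
    c4a, c4b = record[12], record[13]
    nonrepeating = (key, c2a, c2b, c4a, c4b)
    # occurrence number i+1 preserves the physical order (column 1B)
    repeating = [(key, i + 1) + tuple(g) for i, g in enumerate(groups)]
    return nonrepeating, repeating

rec = (12, "AB", "A", 12, "A", "A", 34, "B", "B", 56, "C", "C", 123, "A")
parent, children = normalize(rec)
print(parent)    # (12, 'AB', 'A', 123, 'A')
print(children)  # three rows keyed by (12, occurrence)
```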
Thanks
Doug Wingate, October 27, 2004 - 8:11 pm UTC
All right, thanks, Tom. I was hoping there was a way to overlay normalized external tables on unnormalized external files. However, even though it seems there's not, I can see that by using external tables such as the one you've made (and leaving normalization to be accomplished in the process of insertion), I can at least accomplish two other goals: (1) I can provide human-readable views of the EBCDIC, packed contents of the external files prior to their insertion into the database (by letting the ORACLE_LOADER access driver make the character set and numeric datatype conversions), and (2) I can internalize the load process to the instance instead of using a utility. By themselves, those still seem like worthy reasons to use external tables.
Load only different records
A reader, November 04, 2004 - 4:00 pm UTC
I have a table like
create table cust_hist(cust_id int,modified_dt date,i int,j int,k int);
i.e. all modifications of the cust_id are timestamped and stored in the table.
These changes come from a vendor-supplied external file. I have created an external table based on this file.
Question:
How can I load only records from the file/table that are different from the latest (based on modified_dt) record for that cust_id? Of course, if the cust_id doesn't exist in my table, load it in.
I tried something like
select * from ext_Table
where (...) not in (select * from (select max(modified_dt) over (partition by cust_id) max_dt,a.* from cust_hist a) where modified_dt=max_dt)
but NULLs are wreaking havoc.
Thanks
November 05, 2004 - 3:05 pm UTC
ops$tkyte@ORA9IR2> create table cust_hist(cust_id int,modified_dt date,i int,j int,k int);
Table created.
ops$tkyte@ORA9IR2> create table et(cust_id int,modified_dt date,i int,j int,k int);
Table created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into et
2 select rownum, created, user_id, user_id, user_id
3 from all_users;
20 rows created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> merge into cust_hist
2 using ( select et.*
3 from et left OUTER join cust_hist on (et.cust_id = cust_hist.cust_id)
4 where et.modified_dt > nvl(cust_hist.modified_dt,et.modified_dt-1) ) et
5 on (cust_hist.cust_id = et.cust_id)
6 when matched then update set modified_dt = et.modified_dt, i = et.i, j = et.j, k = et.k
7 when not matched then insert values ( et.cust_id, et.modified_dt, et.i, et.j, et.k )
8 /
20 rows merged.
ops$tkyte@ORA9IR2> /
0 rows merged.
ops$tkyte@ORA9IR2> update et set modified_dt = sysdate, i = 55 where cust_id = 5;
1 row updated.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select * from cust_hist where cust_id = 5;
CUST_ID MODIFIED_ I J K
---------- --------- ---------- ---------- ----------
5 24-SEP-04 49 49 49
ops$tkyte@ORA9IR2> merge into cust_hist
2 using ( select et.*
3 from et left OUTER join cust_hist on (et.cust_id = cust_hist.cust_id)
4 where et.modified_dt > nvl(cust_hist.modified_dt,et.modified_dt-1) ) et
5 on (cust_hist.cust_id = et.cust_id)
6 when matched then update set modified_dt = et.modified_dt, i = et.i, j = et.j, k = et.k
7 when not matched then insert values ( et.cust_id, et.modified_dt, et.i, et.j, et.k )
8 /
1 row merged.
ops$tkyte@ORA9IR2> select * from cust_hist where cust_id = 5;
CUST_ID MODIFIED_ I J K
---------- --------- ---------- ---------- ----------
5 05-NOV-04 55 49 49
ops$tkyte@ORA9IR2> insert into et values ( 100, sysdate, 1,2,3 );
1 row created.
ops$tkyte@ORA9IR2> select * from cust_hist where cust_id = 100;
no rows selected
ops$tkyte@ORA9IR2> merge into cust_hist
2 using ( select et.*
3 from et left OUTER join cust_hist on (et.cust_id = cust_hist.cust_id)
4 where et.modified_dt > nvl(cust_hist.modified_dt,et.modified_dt-1) ) et
5 on (cust_hist.cust_id = et.cust_id)
6 when matched then update set modified_dt = et.modified_dt, i = et.i, j = et.j, k = et.k
7 when not matched then insert values ( et.cust_id, et.modified_dt, et.i, et.j, et.k )
8 /
1 row merged.
ops$tkyte@ORA9IR2> select * from cust_hist where cust_id = 100;
CUST_ID MODIFIED_ I J K
---------- --------- ---------- ---------- ----------
100 05-NOV-04 1 2 3
A reader, November 05, 2004 - 3:36 pm UTC
Um, not exactly.
update et set modified_dt=sysdate,k=999 where cust_id=100;
Now when I run your MERGE, it updates the cust_id=100 row. I want it to create a NEW row with cust_id=100 and the new modified_dt (that's the reason for having the modified_dt; I don't want to lose any modifications to the cust_id, but I don't want to keep storing rows if nothing has changed!)
So, I never want to UPDATE my cust_hist table. I only want to keep appending new rows to the table if the row in the external table is different from the latest row (based on modified_dt) for that cust_id.
Help? Thanks
November 05, 2004 - 5:48 pm UTC
the problem description was not "very clear".
insert into cust_hist
select * from et
where (cust_id, modified_dt) not in ( select cust_id, modified_dt from cust_hist);
period, done.
cust_id and modified_dt should BOTH be not null -- no null issues.
Not quite
A reader, November 05, 2004 - 6:06 pm UTC
Um, not quite again.
"insert into cust_hist
select * from et
where (cust_id, modified_dt) not in ( select cust_id, modified_dt from cust_hist);"
I don't know how to communicate this better. My vendor is giving me a "dupe" with only the modified_dt different! They insist that they cannot prevent this since they are giving us only a subset of columns, and if some other column on the record changes, they ship it to us. But as far as we are concerned, it is a dupe.
In our example, if I have
cust=id=1,modified_dt=1/1/2004,i=1,j=1,k=1
They send me
cust=id=1,modified_dt=1/2/2004,i=1,j=1,k=1
Your SQL above will load this. But I don't want to load this since the content of the record (i/j/k) is the same as the latest record I already have for that cust_id.
I would want to load the following records
cust=id=1,modified_dt=1/2/2004,i=1,j=1,k=null
cust=id=1,modified_dt=1/2/2004,i=1,j=1,k=2
i.e. a different modified_dt (most likely in the future of what I already have, but it could be in the past too). More importantly, the "data" should be "different"!
Help? Thanks
November 05, 2004 - 6:29 pm UTC
ahh, thats more informative.
select cust_id, i, j, k from ET
MINUS
select cust_id, i, j, k from cust_hist
shows the records to load (nulls are taken care of)
insert into cust_hist
select * from et
where cust_id in (select cust_id from (THAT_QUERY) );
if you need to get their modified date (unfortunately, we'd have to read that ET twice here)
if you can get away with just loading the CURRENT sysdate as the modified dt, you can:
insert into cust_hist
select x.*, sysdate
from (THAT_QUERY) x;
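The reason MINUS is attractive here is that set difference on full rows treats NULL like an ordinary value, while `NOT IN` and `<>` predicates go three-valued on NULLs. A Python sketch of the same two-step (tuples with None model rows with NULLs; the column list is from the example, and this illustrates the logic rather than executing any SQL):

```python
# Sketch of the MINUS-based delta: difference the data columns (no
# modified_dt), then load the full ET rows for the cust_ids that differ.
# Tuple set difference treats None (NULL) as an ordinary value, which is
# exactly why MINUS needs no per-column IS NULL gymnastics.
def changed_cust_ids(et_rows, hist_rows):
    # rows are (cust_id, i, j, k); modified_dt deliberately excluded
    return {row[0] for row in set(et_rows) - set(hist_rows)}

et = [(1, 1, 1, None), (2, 5, 5, 5)]
hist = [(1, 1, 1, None), (2, 5, 5, 9)]
print(changed_cust_ids(et, hist))  # {2}: cust 1 matches exactly, NULL included
```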
Brilliant!
A reader, November 05, 2004 - 8:23 pm UTC
Thanks! Nah, the external table is really small, so reading it twice is not a problem
Thanks again
A reader, November 05, 2004 - 9:33 pm UTC
Hm, wait a minute
select cust_id, i, j, k from ET
MINUS
select cust_id, i, j, k from cust_hist
will give me all the records in the file that are different than any record I already have. I want the record from the file only if it is different than the latest record I have for that cust_id?
November 06, 2004 - 10:33 am UTC
this is the strangest "delta" file I've ever heard of then.
you have to join and compare
select et.*
from et, (select *
from (select cust_hist.*,
max(modified_dt) over (partition by cust_id) maxdt
from cust_hist)
where modified_dt = maxdt) cust_hist
where et.cust_id = cust_hist.cust_id(+)
and (et.modified_dt > cust_hist.modified_dt or cust_hist.cust_id is null)
and ((cust_hist.cust_id is null) OR
(et.i <> cust_hist.i OR (et.i is null and cust_hist.i is not null) OR
(et.i is not null and cust_hist.i is null)) OR
(et.j <> cust_hist.j OR (et.j is null and cust_hist.j is not null) OR
(et.j is not null and cust_hist.j is null)) OR
.......
)
Outer join ET to CUST_HIST.
Keep the records where et.modified_dt exceeds the cust_hist.modified_dt OR the cust_hist.cust_id is NULL (new record).
Then -- look to see if this is a new record OR i's differ OR j's differ and so on....
Deltafile
A reader, November 06, 2004 - 2:51 pm UTC
I already told you the reason...
My vendor is giving me a "dupe" with only the modified_dt different! They insist that they cannot prevent this since
they are giving us only a subset of columns and if some other column on the record changes, they ship it to us. But as far as we are concerned, it is a dupe. So I want to load the record only if something is different than the last version of the record I have!
It's not that strange?
I was hoping that query wouldn't be so hairy (there are 50 columns), but I guess there is no other way.
Thanks
A reader, November 06, 2004 - 3:58 pm UTC
Instead, cant I do
select cust_id, i, j, k from ET
MINUS
select cust_id, i, j, k from
(select max(modified_dt) over (partition by cust_id) max_dt,
a.* from cust_hist a)
where max_dt=modified_dt
insert into cust_hist
select * from et
where cust_id in (select cust_id from (THAT_QUERY));
I like the way MINUS takes care of all the NULLs for me; doing (a is null and b is not null ...) for 50 columns is a little painful?
Would this work? Thanks
November 06, 2004 - 6:03 pm UTC
no, for the same reason the simple minus won't work (as far as I can tell..)
A reader, November 07, 2004 - 8:18 pm UTC
"no, for the same reason the simple minus wont work (as far as I can tell..)"
Um, I am a little lost with all the back and forth we have done on this.
What is the reason this won't work?
November 08, 2004 - 9:44 am UTC
You said:
<quote>
Hm, wait a minute
select cust_id, i, j, k from ET
MINUS
select cust_id, i, j, k from cust_hist
will give me all the records in the file that are different than any record
I already have. I want the record from the file only if it is different than the
latest record I have for that cust_id?
</quote>
Well:
select cust_id, i, j, k from ET
MINUS
select cust_id, i, j, k from
(select max(modified_dt) over (partition by cust_id) max_dt,
a.* from cust_hist a)
where max_dt=modified_dt
will give you even MORE records -- so, if the first one gave you too many, this'll give you even "more"
btw: should have mentioned
where (a.c <> b.c OR (a.c is null and b.c is not null) OR (a.c is not null and b.c is null))
can be expressed as:
where decode( a.c, b.c, 1, 0 ) = 0;
decode treats NULLs as the same, so that is like saying:
if ( a.c = b.c /* treating nulls as the same */ )
then
return 1
else
return 0
end if
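That decode() shorthand is a null-safe equality test. The same contrast sketched in Python, with SQL's three-valued `<>` modeled by returning None for UNKNOWN (a logic illustration, not Oracle code):

```python
# decode(a, b, 1, 0) returns 1 when a equals b OR both are NULL;
# a plain a <> b predicate yields UNKNOWN whenever either side is NULL.
def decode_eq(a, b):
    return 1 if a == b else 0   # Python's == already treats None = None

def sql_neq(a, b):
    if a is None or b is None:  # three-valued logic: UNKNOWN
        return None
    return a != b

print(decode_eq(None, None))  # 1: decode sees the two NULLs as the same
print(sql_neq(None, 5))       # None: the plain <> never fires on a NULL
```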
A reader, November 08, 2004 - 9:48 am UTC
select cust_id, i, j, k from ET
MINUS
select cust_id, i, j, k from
(select max(modified_dt) over (partition by cust_id) max_dt,
a.* from cust_hist a)
where max_dt=modified_dt
will give you even MORE records -- so, if the first one gave you too many, this'll give you even "more"
Maybe I need my first cup of coffee, but I don't understand why this would give me more records than the first one? I am just selecting the latest record for each cust_id, and minusing that out from all the records? This would give me all the records in my file that are different from the latest record for that cust_id. Exactly what I wanted? Your MINUS would give me records which are different from ANY record for that cust_id?
November 08, 2004 - 4:52 pm UTC
if this:
select cust_id, i, j, k from ET
MINUS
select cust_id, i, j, k from cust_hist
gives you more than you want, then that one (which selects FEWER records from cust_hist) will give you at least as many records from ET, if not more (since you are subtracting a smaller set from ET, at least as many ET records, if not more, will survive the MINUS)
A reader, November 08, 2004 - 4:57 pm UTC
But it's not about which query gives me more or fewer records. It is about which one is more accurate, given my requirements?
As I said, my version of the MINUS would give me all
the records in my file that are different than the latest record for that cust_id. Exactly what I wanted? Your MINUS would give me records which are different than ANY record for
that cust_id?
What am I missing? Thanks
November 08, 2004 - 8:35 pm UTC
you want "older records"?
what if in the ET, there is a record for 01-jan-2004
and the current record in your cust_hist is for 04-jan-2004
do you want that record from the ET?
(i see what you are saying now)
A reader, November 08, 2004 - 9:31 pm UTC
"what if in the ET, there is a record for 01-jan-2004
and the current record in your cust_hist is for 04-jan-2004
do you want that record from the ET?"
Yes
As I said earlier, ..."i.e. a different modified_dt (most likely in the future of what I already have, but it could be in the past too). More importantly, the "data" should be
"different"!
So, given all this, is my version of the MINUS query the best way to do this?
Thanks
November 08, 2004 - 9:38 pm UTC
but your minus does not filter out "old different" data -- which is the crux of my prior statements.
it only filters out "last modified_dt differences"
We would be back to a simple
select ... from et
minus
select ... from cust_hist
to get:
i.e. a different modified_dt (most likely in the future
of what I already have, but it could be in the past too). More importantly, the
"data" should be
"different"!
if your last modified_dt is 04-jan-2004 for cust_id = 123
And they give you a record with a modified_dt of 01-jan-2004 for cust_id = 123
and the data is different in that record
AND YOU HAVE a cust_hist record with a modified_dt of 01-jan-2004 for cust_id = 123
did you really want to update it?
A reader, November 08, 2004 - 9:48 pm UTC
Well, my cust_hist table is NEVER updated, just inserted into. It has a surrogate PK. So, yes, if they give me a cust_id=123 record dated 1/1/2004, I would just want to generate a new surrogate PK (sequence) and just stuff it in. Typically, though, I would never get a modified_dt from the vendor older than what I already have.
As I said, right now, we are inserting tons of dupes every day because all these checks are not there. I just want to minimize putting in data if nothing has changed.
So, to answer your question,
"if your last modified_dt is 04-jan-2004 for cust_id = 123
And they give you a record with a modified_dt of 01-jan-2004 for cust_id = 123
and the data is different in that record
AND YOU HAVE a cust_hist record with a modified_dt of 01-jan-2004 for cust_id = 123
did you really want to update it?"
In this case, I would just generate a new PK for the record in my cust_hist table and add it in because it is different than the latest record I have for that cust_id?
So, given all this, which would work? Your MINUS or mine? I am still not sure. Thanks
November 09, 2004 - 8:29 am UTC
Well, then, do you really care about the "last modified_dt"?
Why not just diff the ET with CUST_HIST by your entire record, including the modified_dt?
If you have it, skip it; if you don't, you appear to want it.
How to discard records starting with TAB or' ['?
Arun Gupta, November 11, 2004 - 11:08 am UTC
Tom,
I am working on creating an external table. I want to reject records starting with TAB character (0x09) or starting with '['. I tried LOAD WHEN, but the syntax eludes me. The records are delimited by newline and the fields are terminated by whitespace.
Thanks
November 11, 2004 - 11:16 am UTC
is column 1 part of the data on the other records?
I'd just as soon "map it"
and then query:
select * from et where column_1 not in ( chr(9), '[' );
Arun Gupta, November 11, 2004 - 11:42 pm UTC
Actually, the file I am trying to load is logs generated by a web logging program. It was very easy: records delimited by newline and fields separated by whitespace. The correct pattern is:
Action Hostname [Timestamp] User details Other details(newline)
Action Hostname [Timestamp] User details Other details(newline)
Then I hit a road block when the logs suddenly broke the pattern. I hit upon
[Timestamp](newline)
(tab)other_detail_1(newline)
(tab)other_detail_2(newline)
(tab)other_detail_3(newline)
and then the correct pattern continues. The malformed pattern is found throughout the input file mixed with the correct pattern. I have to find a way to reject the malformed records. These records are security information and cannot be lost. They have to go into either badfile or discardfile as a whole without any modifications.
The Timestamp is enclosed in []. When Oracle finds the first record beginning with [, it puts this record into the badfile. Then it continues to load the other_detail_1, other_detail_2 and other_detail_3 into the database. This causes the malformed record to split, which is not acceptable.
One option I thought of was if somehow records beginning with [ and TAB can be rejected, the malformed records will go into the discardfile as a whole and will not be lost or modified. The condition_clause of LOAD WHEN does allow range comparison but I could not make it work. Something like
load when (1:1 != '0x09' OR 1:1 != '[') would be ideal.
Thanks...
November 12, 2004 - 7:03 am UTC
sure, you could do that or just map the first character as a column and say:
select * from et where c1 not in ( chr(9), '[' );
which is what I'm going to recommend as the path of least resistance for now :)
Another discard related question
Paul, November 15, 2004 - 3:30 am UTC
I have attempted to use the external table process as a replacement for SQL*Loader. It was progressing well until I bumped into an issue with NULLs.
I am loading the following table:
CREATE TABLE TEST_TBL
(CLAIM_ID VARCHAR2(38) NOT NULL,
DIAG_SEQUENCE NUMBER(10) NOT NULL,
POLICY_HOLDER_ID VARCHAR2(26) NOT NULL,
SUFFIX_ID VARCHAR2(4) NOT NULL,
DIAG_CD VARCHAR2(7) NOT NULL,
DISCHARGE_DATE DATE );
Notice that most of the fields are NOT NULL.
I am using the following external definition:
Create table Test_Tbl_Ext
(CLAIM_ID VARCHAR2(38),
DIAG_SEQUENCE NUMBER(10),
POLICY_HOLDER_ID VARCHAR2(26),
SUFFIX_ID VARCHAR2(4),
DIAG_CD VARCHAR2(7),
DISCHARGE_DATE DATE
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY Test_external_tables
ACCESS PARAMETERS
( RECORDS DELIMITED BY NEWLINE
BADFILE 'Test_tbl.bad'
LOGFILE 'Test_tbl.log'
FIELDS TERMINATED BY ' '
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
( CLAIM_ID,
DIAG_SEQUENCE,
POLICY_HOLDER_ID,
SUFFIX_ID,
DIAG_CD,
DISCHARGE_DATE date mask "MM/DD/YYYY"
)
)
LOCATION ('Test_tbl.tab')
) REJECT LIMIT UNLIMITED;
As you can see, the records are tab delimited. In testing this, I use the basic "insert into test_tbl ( select * from test_tbl_ext );" syntax.
I was surprised to discover that while it behaves like SQL*Loader and rejects format, size and other data-related issues, it does not handle the NOT NULL issue; it just stops with "ORA-01400: cannot insert NULL into ("TEST"."TEST_TBL"."DIAG_CD")".
Is there no good way to get an EXTERNAL table to handle the NULLs properly and put them in the bad file? Or is the answer to add NOT NULL tests to every column in the SELECT statement (which seems like overkill)?
November 15, 2004 - 6:42 am UTC
umm, you TOLD it that "missing field values are null"
but in any case -- this did handle the nulls properly? having a NULL value is not an error (until you go to load it)
but yes, using "where c1 is not null" would be appropriate here.
Arun Gupta, November 15, 2004 - 10:27 am UTC
Tom,
I followed your suggested method and it works fine. However, I still wanted to experiment with LOAD WHEN and it doesn't seem to work. I used the syntax:
load when ((1:1) != '0x09' or (1:1) != '[')
As per the documentation, this should put any records starting with either a TAB or '[' into the discardfile. The behaviour I am getting is that records beginning with '[' are being put in the badfile and records beginning with TAB are being loaded into the database. I have tried this on 9i (Solaris) and 10g (Linux). The statement is:
CREATE TABLE t_ext_sm_a
(
sm_eventid VARCHAR2(30),
sm_hostname VARCHAR2(30),
sm_timestamp VARCHAR2(30),
sm_username VARCHAR2(1000),
sm_agentname VARCHAR2(1000),
sm_status_sessid VARCHAR2(1000),
sm_reason VARCHAR2(1000),
sm_dummy1 VARCHAR2(100),
sm_dummy2 VARCHAR2(100),
sm_dummy3 VARCHAR2(100),
sm_dummy4 VARCHAR2(100)
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY SM
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE
BADFILE sm:'bad_records.txt'
DISCARDFILE sm:'discard_records.txt'
NOLOGFILE
load when ((1:1) != '0x09' or (1:1) != '[')
FIELDS TERMINATED BY WHITESPACE
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
(
sm_eventid char(30) terminated by whitespace,
sm_hostname char(30) terminated by whitespace,
sm_timestamp char terminated by whitespace enclosed by '[' and ']',
sm_username char(1000) terminated by whitespace enclosed by '"',
sm_agentname char(1000) terminated by whitespace optionally enclosed by '"',
sm_status_sessid char(1000) terminated by whitespace optionally enclosed by '[' and ']',
sm_reason char(1000) terminated by whitespace optionally enclosed by '[' and ']',
sm_dummy1 char(100) terminated by ',',
sm_dummy2 char(100) terminated by ',',
sm_dummy3 char(100) terminated by ',',
sm_dummy4 char(100) terminated by whitespace
)
)
location ('test.log'))
reject limit unlimited;
I also tried specifying the characterset as WE8MSWIN1252 since the file was generated on Windows machine, but no success.
Thanks
Is this NOT an AND ?
Alan Stephen, November 15, 2004 - 12:06 pm UTC
I suspect that the OR should be an AND. The current expression will allow a row through if the first condition is true OR the second condition is. The '[' passes the first test, but fails the second.
November 15, 2004 - 3:49 pm UTC
very good :)
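For the record, a sketch of the corrected clause -- "does not start with TAB AND does not start with '['", which is the logic Alan describes (untested here):

```sql
load when ((1:1) != '0x09' and (1:1) != '[')
```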
Rearranging a record
A reader, November 23, 2004 - 7:13 am UTC
I have a file with 3 record types with different layouts. I want to rearrange the layout as per a mapping and output the new records
A......
B.....
C....
The mapping is
A5-A10 -> B6-B11
A11 -> C56
and so on
How can I use external tables and a mapping table to do this? Yes, I know I can't write external tables in 9iR2, but I can write the record to a table and dump it out.
Or are Unix text processing tools like awk/perl better suited for this?
Thanks
November 23, 2004 - 7:49 am UTC
do you have a mapping table:
old new
---- ----
A5 B6
A6 B7
A7 B8
A8 B9
A9 B10
A10 B11
A11 C56
if you have that, you can
update ( select t1.c1 old_c1, t2.new new_c1
from existing_table t1, mapping_table t2
where t1.c1 = t2.old )
set old_c1 = new_c1;
A reader, November 23, 2004 - 9:19 am UTC
Guess I should have explained better: those A1, B1, etc. are not data elements, they are the record type and column positions of the data in the record!
For example, the file has records like
Aasajahaldkhadlahdakldatyu
Bjfksgajasdgalashdajhaadar
I want the records rearranged like
Ahsjdsajksadshakgdadadjassa
Bahsdsajjsahakgakjshsakgads
where the mapping is as per that table i.e. record A, column position 1-5 should go to record B column position 6-10 and so on
I don't see how your update solves the problem?
November 23, 2004 - 9:35 am UTC
you have two records? just edit them -- not sure where you are going with this, I've never really heard of exchanging fields between rows like this before.
but if you have just two records, do whatever -- unless there is more "hidden" information here, doesn't seem like a problem you'd even need a computer for.
A reader, November 23, 2004 - 9:46 am UTC
I have 1000s of records in a file, but 5 record types. This is a flat-file from a vendor. They changed the layout on us and I am trying to adjust. The layout changed as per the mapping I keep talking about. Basically, they moved fields around. So, when we send them the records, I want to do the same re-arranging.
Yes, this is purely a text/file processing problem, but I was hoping some cool external tables/SQL trick would do the job?
I already explained the full scope of the problem in my previous followup?
November 23, 2004 - 10:28 am UTC
not really (on the explain full scope)
is there "one" A record and "one" B record or does order matter or ......
give me a 10 record example with as much detail as possible.
A reader, November 23, 2004 - 11:37 am UTC
OK here goes
The file contains 10 records with the following format
1:1 = pk
2:2 = record type
3 onwards depends on the record type; it doesn't matter what the layout is, we want to access each one as a 1-character field anyway.
The input file is
1Aabcdxyz
1Bqwertyu
1Cpopopoy
The mapping says the data in record type A, position 3-6 needs to go on a B record on positions 10-13
B4-5 -> C3-4
C8-9 -> D6-7
So, given the above input, the output should be
1B abcd (10-13)
1Cwe (3-4)
1D oy (6-7)
i.e. fill up the rhs of the mapping from the lhs and pad with spaces as necessary. Not all information from the source record might be mapped on the target record. Multiple source positions might map to the same target position, in which case the last one wins (as ordered by pk,rec_type).
Since the target could be mapped from any source record type (for that pk), I guess some sort of break-processing is needed. i.e. wait for all source records for pk=1 to be filled up and then start to map and output the target records?
Help? Thanks
November 23, 2004 - 12:42 pm UTC
How many A records, B records, C records, etc -- or is there a map PER record (which sounds unreasonable given the number of records you say there would be)
and if there are many A records, many B records -- how do you know "which" is "what"
A reader, November 23, 2004 - 11:47 am UTC
The pk for each "set" is the pk+rec_type, so for id=1, there will be only 1 A record, 1 B record, etc.
November 23, 2004 - 1:32 pm UTC
ops$tkyte@ORA9IR2> select pk || typ ||
2 case when typ = 'A' then arec
3 when typ = 'B' then rpad(nvl(substr(brec,1,9-2),' '),7) || nvl(substr(arec,3-2,4),rpad(' ',4)) || substr(brec,14-2)
4 when typ = 'C' then rpad(nvl(substr(crec,1,2-2),' '),0) || nvl(substr(brec,4-2,2),rpad(' ',2)) || substr(crec,5-2)
5 when typ = 'D' then rpad(nvl(substr(drec,1,6-2),' '),4) || nvl(substr(crec,8-2,2),rpad(' ',2)) || substr(drec,8-2)
6 end rec
7 from (
8 select pk, typ,
9 max( case when typ='A' then data end ) over (partition by pk) arec,
10 max( case when typ='B' then data end ) over (partition by pk) brec,
11 max( case when typ='C' then data end ) over (partition by pk) crec,
12 max( case when typ='D' then data end ) over (partition by pk) drec
13 from t
14 )
15 /
REC
--------------------------------------------------------------
1Aabcdxyz
1Bqwertyuabcd
1Cwepopoy
1D oy
would be a sql approach.
A reader, November 23, 2004 - 1:34 pm UTC
"How many A records, B records, C records, etc -- or is there a map PER record"
1 record of each type per pk (pk+rec_type is unique)
Well, the map is per record type
Ax - Ay maps to Bm - Bn
for all A records
Think of it as one logical entity split over 5 physical records
November 23, 2004 - 2:28 pm UTC
sorry -- you've seen my "approach", i'm not following your lingo -- ax-ay maps to bm-bn ?
My approach using sql would be a variation on the above -- substr. See what you see with that, else I guess use whatever tool you see fit.
A reader, November 23, 2004 - 1:43 pm UTC
Your output doesn't match my required output.
1. If there is no mapping for the target position, it should be blank; the source record shouldn't "peek through".
2. The output shouldn't have the A record since it is not present in the target after mapping
3. How can I put the mapping into a table so that the query can be table driven?
4. In any case, can you please explain your solution so I can possibly adapt it for my needs?
Thanks
November 23, 2004 - 2:46 pm UTC
please -- use the approach; you should have everything you need to develop your own approach if you understood what I did.
I put the "old record's" value there; it should be pretty trivial for you to blank it out.
you can use a where clause to remove whatever you want.
All my thing is doing is making every record type (a,b,c,d,....) available to every row by PK value -- so PK = 1 and TYP=A record, it has the arec, brec, crec, drec, whatever.
A reader, November 23, 2004 - 3:53 pm UTC
Right, I understand your approach; my main question was how can I make it table-driven, i.e. your query has three concatenations on each column because I specified 3 "transformation rules".
What if I want these mapping rules in a table and apply all the rules on each row? The table would look like
CREATE TABLE mapping
(
S_REC_TYPE VARCHAR2(1),
S_START INTEGER,
S_END INTEGER,
T_REC_TYPE VARCHAR2(1),
T_START INTEGER,
T_END INTEGER
)
Thanks
November 23, 2004 - 7:25 pm UTC
"join"
join by s_rec_type. you have your substr subscripts now.
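A sketch of that join, using the MAPPING table DDL from the question (the source table SRC with a single TEXT column is an assumption, borrowed from Gabe's worked example further down):

```sql
-- Each mapping row supplies the substr() subscripts for one rule;
-- adding a rule means adding a row, not editing the query.
select substr(s.text, 1, 1)                                pk,
       m.t_rec_type,
       substr(s.text, m.s_start, m.s_end - m.s_start + 1)  field,
       m.t_start,
       m.t_end
  from src s, mapping m
 where substr(s.text, 2, 1) = m.s_rec_type;
```

From there, a GROUP BY on pk and t_rec_type plus some string assembly produces the rearranged output records.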
in case you can tolerate another opinion ...
Gabe, November 23, 2004 - 5:15 pm UTC
To A reader
You seem to really want an answer here ... if you can take a bit of criticism, this piecemeal approach to _fully_ and _clearly_ defining the context of your question (plus the absence of resources -- create table, etc.) doesn't usually help in getting a quick and complete answer from Tom. BTW, your requirements are still not fully and clearly specified.
In any case, maybe this will help a bit and get you started (my resources are at the end)
Source Data:
flip@FLOP> select * from src t order by substr(text,1,2);
TEXT
---------------------------------------------------------------
1Aabcdxyz
1Bqwertyu
1Cpopopoy
1Dabcdefghijkl
1E1234567890
2Amnbvcxz
2Blmnbgre
2Ciuytree
2Dlkjghfdsaert
2Ethfsgj123456
Maps:
flip@FLOP> select * from map_s2t_vw;
MAPPING_RULE
-------------------------------------------
A3-6 -> B10-13
B4-5 -> C3-4
C8-9 -> D6-7
D10-14 -> E6-10
Results:
flip@FLOP> col text format a30
flip@FLOP> select t.pk,t.tgt_rt,collapse(collapse_expr(t.field,t.tgt_from,t.tgt_thru)) text
2 from (
3 select s.text
4 ,substr(s.text,1,1) pk
5 ,substr(s.text,2,1) rt
6 ,substr(s.text,m.src_from,m.src_thru-m.src_from+1) field
7 ,m.tgt_rt
8 ,m.tgt_from
9 ,m.tgt_thru
10 from src s, map_s2t m
11 where substr(s.text,2,1) = m.src_rt
12 ) t
13 group by t.pk,t.tgt_rt
14 order by t.pk,t.tgt_rt;
P T TEXT
- - ------------------------------
1 B _________abcd
1 C __we
1 D _____oy
1 E _____hijkl
2 B _________mnbv
2 C __mn
2 D _____ee
2 E _____saert
Notes:
1. the results are not deterministic -- the order of collapsing is not deterministic; you would have to modify the collapse_agg_type to implement a _controlled_ collapsing (I didn't do it because your requirement for ordering is ambiguous).
2. I put underscores instead of blanks (one can see them !!!)
3. There is a limit of 4000 characters.
4. I didn't bother to look into the correctness of odciaggregatemerge.
Resources:
create table src ( text varchar2(4000) not null );
create table map_s2t
( src_rt varchar2(1) not null check (src_rt in ('A','B','C','D','E'))
,src_from number(9) not null check (src_from>2)
,src_thru number(9) not null check (src_thru>2)
,tgt_rt varchar2(1) not null check (tgt_rt in ('A','B','C','D','E'))
,tgt_from number(9) not null check (tgt_from>2)
,tgt_thru number(9) not null check (tgt_thru>2)
,constraint chk_map check (src_thru-src_from = tgt_thru-tgt_from)
);
insert into map_s2t values ('A', 3, 6,'B',10,13);
insert into map_s2t values ('B', 4, 5,'C', 3, 4);
insert into map_s2t values ('C', 8, 9,'D', 6, 7);
insert into map_s2t values ('D',10,14,'E', 6,10);
create or replace view map_s2t_vw as
select decode( src_from,src_thru
,src_rt||to_char(src_from)||' -> '||tgt_rt||to_char(tgt_from)
,src_rt||to_char(src_from)||'-'||to_char(src_thru)||' -> '
||tgt_rt||to_char(tgt_from)||'-'||to_char(tgt_thru)
) mapping_rule
from map_s2t
;
--12345678901234
insert into src values ('1Aabcdxyz');
insert into src values ('1Bqwertyu');
insert into src values ('1Cpopopoy');
insert into src values ('1Dabcdefghijkl');
insert into src values ('1E1234567890');
insert into src values ('2Amnbvcxz');
insert into src values ('2Blmnbgre');
insert into src values ('2Ciuytree');
insert into src values ('2Dlkjghfdsaert');
insert into src values ('2Ethfsgj123456');
create or replace type collapse_expr as object
( str varchar2(4000)
,pos_from number(9)
,pos_thru number(9)
);
/
show errors
CREATE OR REPLACE TYPE collapse_agg_type AS OBJECT
(
str VARCHAR2 (4000),
STATIC FUNCTION odciaggregateinitialize (
sctx IN OUT collapse_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateiterate (
SELF IN OUT collapse_agg_type,
ctx IN collapse_expr)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateterminate (
SELF IN collapse_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER)
RETURN NUMBER,
MEMBER FUNCTION odciaggregatemerge (
SELF IN OUT collapse_agg_type,
ctx2 collapse_agg_type)
RETURN NUMBER
);
/
show errors
CREATE OR REPLACE TYPE BODY collapse_agg_type
AS
STATIC FUNCTION odciaggregateinitialize (
sctx IN OUT collapse_agg_type)
RETURN NUMBER
IS
BEGIN
sctx := collapse_agg_type(null);
RETURN odciconst.success;
END;
MEMBER FUNCTION odciaggregateiterate (
SELF IN OUT collapse_agg_type,
ctx IN collapse_expr)
RETURN NUMBER
IS
len pls_integer;
BEGIN
if self.str is null then
self.str := lpad(ctx.str,ctx.pos_thru,'_');
else
len := length(self.str);
if len < ctx.pos_from then
self.str := self.str || lpad(ctx.str,ctx.pos_from-len-1,'_');
elsif len >= ctx.pos_thru then
self.str := substr(self.str,1,ctx.pos_from-1)||ctx.str||substr(self.str,ctx.pos_thru+1);
else
self.str := substr(self.str,1,ctx.pos_from-1)||ctx.str;
end if;
end if;
RETURN odciconst.success;
END;
MEMBER FUNCTION odciaggregateterminate (
SELF IN collapse_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER)
RETURN NUMBER
IS
BEGIN
returnvalue := SELF.str;
RETURN odciconst.success;
END;
MEMBER FUNCTION odciaggregatemerge (
SELF IN OUT collapse_agg_type,
ctx2 IN collapse_agg_type)
RETURN NUMBER
IS
BEGIN
SELF.str := SELF.str || ctx2.str;
RETURN odciconst.success;
END;
END;
/
show errors
CREATE OR REPLACE FUNCTION collapse ( ctx IN collapse_expr)
RETURN VARCHAR2 DETERMINISTIC PARALLEL_ENABLE
AGGREGATE USING collapse_agg_type;
/
show errors
Good Luck.
To Gabe
A reader, November 23, 2004 - 9:54 pm UTC
Gasp! That is exactly what I am looking for; wish I had put it that way the first time and not gone back and forth with Tom.
1. Why do you say the collapsing is not deterministic?
2. I don't really care about odciaggregatemerge; can I just remove the PARALLEL_ENABLE?
3. Finally, I tried but didn't quite get how the aggregate stuff is working. What exactly does collapse(collapse_expr(t.field,t.tgt_from,t.tgt_thru)) do?
Thanks a lot!
To Gabe
A reader, November 23, 2004 - 10:40 pm UTC
I think I understand this now. I added another row to the mapping table
insert into map_s2t values ('A',7,9,'B', 3,5);
and did (in 10g)
set lines 100
col stragg format a40
col text format a40
SQL> select t.pk,t.tgt_rt,
2 cast(collect(t.field||'/'||t.tgt_from||'/'||t.tgt_thru) as varchar2_tab) stragg,
3 collapse(collapse_expr(t.field,t.tgt_from,t.tgt_thru)) text
4 from (
5 select s.text
6 ,substr(s.text,1,1) pk
7 ,substr(s.text,2,1) rt
8 ,substr(s.text,m.src_from,m.src_thru-m.src_from+1) field
9 ,m.tgt_rt
10 ,m.tgt_from
11 ,m.tgt_thru
12 from src s, map_s2t m
13 where substr(s.text,2,1) = m.src_rt
14 ) t
15 group by t.pk,t.tgt_rt
16 order by t.pk,t.tgt_rt;
P T STRAGG TEXT
_ _ __________________________________________________ ______________________________
1 B VARCHAR2_TAB('xyz/3/5', 'abcd/10/13') __xyzabcd
1 C VARCHAR2_TAB('we/3/4') __we
1 D VARCHAR2_TAB('oy/6/7') _____oy
1 E VARCHAR2_TAB('hijkl/6/10') _____hijkl
2 B VARCHAR2_TAB('cxz/3/5', 'mnbv/10/13') __cxzmnbv
2 C VARCHAR2_TAB('mn/3/4') __mn
2 D VARCHAR2_TAB('ee/6/7') _____ee
2 E VARCHAR2_TAB('saert/6/10') _____saert
8 rows selected.
I see that in the VARCHAR2_TAB('xyz/3/5', 'abcd/10/13'), the abcd/10/13 part is getting processed first and then the xyz/3/5, resulting in a final result of "__xyzabcd". The result should be "__xyz____abcd".
But how can the order in which the aggregate function gets its inputs be controlled?
Thanks
To Gabe
A reader, November 23, 2004 - 11:00 pm UTC
SQL> select t.pk,t.tgt_rt,
2 cast(collect(t.field||'/'||t.tgt_from||'/'||t.tgt_thru) as varchar2_tab) stragg,
3 collapse(collapse_expr(t.field,t.tgt_from,t.tgt_thru)) text
4 from (
5 select s.text
6 ,substr(s.text,1,1) pk
7 ,substr(s.text,2,1) rt
8 ,substr(s.text,m.src_from,m.src_thru-m.src_from+1) field
9 ,m.tgt_rt
10 ,m.tgt_from
11 ,m.tgt_thru
12 from src s, <b>(select * from map_s2t order by tgt_rt,tgt_from)</b> m
13 where substr(s.text,2,1) = m.src_rt
14 ) t
15 group by t.pk,t.tgt_rt
16 order by t.pk,t.tgt_rt;
P T STRAGG TEXT
- - -------------------------------------------------- ------------------------------
1 B VARCHAR2_TAB('abcd/10/13', 'xyz/3/5') --xyz----abcd
1 C VARCHAR2_TAB('we/3/4') --we
1 D VARCHAR2_TAB('oy/6/7') -----oy
1 E VARCHAR2_TAB('hijkl/6/10') -----hijkl
2 B VARCHAR2_TAB('mnbv/10/13', 'cxz/3/5') --cxz----mnbv
2 C VARCHAR2_TAB('mn/3/4') --mn
2 D VARCHAR2_TAB('ee/6/7') -----ee
2 E VARCHAR2_TAB('saert/6/10') -----saert
seems to work as I want. Did I just get lucky or is that what you meant by controlled collapsing?
good to see you got busy on your task ...
Gabe, November 24, 2004 - 12:07 am UTC
To "A reader" ...
Yes ... you did get lucky ... see Tom's approach
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:15637744429336#16551777586484
Basically you have to cache/array the fields during odciaggregateiterate and only assemble the final aggregate in odciaggregateterminate ... you can _control_ the collapsing in here by sorting the cache/array based on whatever criteria you decide (you may have to extend the collapse_expr object to send more stuff in).
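A rough sketch of that shape. Assumptions: a nested-table attribute `pieces` (of a hypothetical type `collapse_tab`, a table of collapse_expr) has been added to collapse_agg_type and initialized empty; only the two methods that change are shown; the sort step is left as a comment; and the overlay loop assumes the sorted pieces do not overlap:

```sql
member function odciaggregateiterate (
  self in out collapse_agg_type,
  ctx  in     collapse_expr )
return number
is
begin
  -- just cache the piece; do no string work yet
  self.pieces.extend;
  self.pieces(self.pieces.count) := ctx;
  return odciconst.success;
end;

member function odciaggregateterminate (
  self        in  collapse_agg_type,
  returnvalue out varchar2,
  flags       in  number )
return number
is
  result varchar2(4000);
begin
  -- sort self.pieces by pos_from (or whatever criteria you decide) here,
  -- then overlay each piece at its target position, padding the gaps:
  for i in 1 .. self.pieces.count loop
    result := rpad(nvl(result,'_'), self.pieces(i).pos_from - 1, '_')
              || self.pieces(i).str;
  end loop;
  returnvalue := result;
  return odciconst.success;
end;
```

With the cache sorted before assembly, the result no longer depends on the order in which rows happen to arrive at the aggregate.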
To Gabe
A reader, November 24, 2004 - 9:15 am UTC
OK even without the 'order by' on the select from map_s2t, how about if I do this?
if len < ctx.pos_from then
self.str := self.str || rpad('_',ctx.pos_from-len,'_') || lpad(ctx.str,ctx.pos_from-len-1,'_');
That also seems to solve my deterministic/ordering issue? i.e. the results are the same regardless of the ordering of the rows in the map table? Comments?
Thanks
P.S: This is a really creative use of user-defined aggregates. Never thought of using them to do arbitrary string "aggregation" in this fashion! Opens up a lot of possibilities, thanks!
How to Pass unix parameter to sqlplus within a create external table statement..
Ananth, January 14, 2005 - 11:53 am UTC
The Unix variable supplying the input file name is not being passed to the SQL LOCATION clause. Need your help .....
[oraerpp@test ap_inv]$ INPUT_DATA=`ls -mtr|cut -d',' -f1|head -1`
[oraerpp@test ap_inv]$ echo $INPUT_DATA
demo.dat
[oraerpp@test ap_inv]$ sqlplus apps/apps@erpp @ext_tab.sql '${INPUT_DATA}'
SQL*Plus: Release 9.2.0.3.0 - Production on Fri Jan 14 08:06:28 2005
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
Enter value for input_data:
old 22: location('&INPUT_DATA'))
new 22: location(''))
Table created.
SQL> exit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
[oraerpp@test ap_inv]$
CREATE TABLE apps.ext_tab (
empno VARCHAR2(5 BYTE),
ename VARCHAR2(25 BYTE),
job VARCHAR2(10 BYTE),
deptno VARCHAR2(5 BYTE)
)
ORGANIZATION EXTERNAL
(TYPE oracle_loader DEFAULT DIRECTORY LOG_DIR_AP_INV
ACCESS parameters
(records delimited BY newline
badfile LOG_DIR_AP_INV:'demodata%a_%p.bad'
discardfile LOG_DIR_AP_INV:'demodata%a_%p.dis'
LOGFILE LOG_DIR_AP_INV:'demodata%a_%p.log'
fields LRTRIM
MISSING FIELD VALUES ARE NULL
(EMPNO POSITION(1:5) CHAR
,ENAME POSITION(6:30) CHAR
,JOB POSITION(31:40) CHAR
,DEPTNO POSITION(41:45) CHAR
)
)
location('&INPUT_DATA'))
REJECT LIMIT UNLIMITED NOPARALLEL;
Need help to pass location as parameter from unix to sqlplus
January 14, 2005 - 7:59 pm UTC
it would be &1 as a parameter to the script, not the name of your environment variable.
put
define INPUT_DATA = &1
at the top of your script.
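Putting it together, a sketch of the top of ext_tab.sql (the rest of the script is unchanged from the question; the "..." stands for the elided DDL):

```sql
-- ext_tab.sql: capture the first command-line argument into INPUT_DATA
define INPUT_DATA = &1
...
location('&INPUT_DATA'))
REJECT LIMIT UNLIMITED NOPARALLEL;
```

Then invoke it without single quotes around the variable -- single quotes stop the shell from expanding it: `sqlplus apps/apps@erpp @ext_tab.sql "${INPUT_DATA}"`.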
Validating external tables
A reader, January 17, 2005 - 12:27 pm UTC
I have a set of 4 files that need to be cross-checked against each other to be considered "good". Also, each file has a header/trailer with sums to ensure that the file is self-consistent.
If I create these 4 files as external tables and do all the stuff above, the files, in effect, would be "loaded" multiple times during the course of doing all the validations above.
In this case, would it make sense to just use traditional sqlldr and load the files into "staging" tables instead of using the external table feature?
Thanks
January 17, 2005 - 5:18 pm UTC
insufficient data to answer -- but probably just use the ETs as if they were your staging tables. You'll be full scanning SOMETHING multiple times, so why add an EXTRA full scan (to load into stage)?
Full scanning something
A reader, January 17, 2005 - 7:55 pm UTC
"You'll be full scanning SOMETHING multiple times"
Right, but it doesn't quite "feel" right to invoke the access driver multiple times for the same file?
In terms of resource usage, wouldn't it be more efficient to do a traditional load into staging and then FTS the staging tables multiple times?
January 17, 2005 - 8:41 pm UTC
Maybe, maybe not -- I'd benchmark your situation....
ops$tkyte@ORA9IR2> drop table stage;
Table dropped.
ops$tkyte@ORA9IR2> REM drop table external_table;
ops$tkyte@ORA9IR2> REM drop table emp;
ops$tkyte@ORA9IR2> REM
ops$tkyte@ORA9IR2> REM create or replace directory data_dir as '/tmp/'
ops$tkyte@ORA9IR2> REM /
ops$tkyte@ORA9IR2> REM
ops$tkyte@ORA9IR2> REM create table external_table
ops$tkyte@ORA9IR2> REM (EMPNO NUMBER(4) ,
ops$tkyte@ORA9IR2> REM ENAME VARCHAR2(10),
ops$tkyte@ORA9IR2> REM JOB VARCHAR2(9),
ops$tkyte@ORA9IR2> REM MGR NUMBER(4),
ops$tkyte@ORA9IR2> REM HIREDATE DATE,
ops$tkyte@ORA9IR2> REM SAL NUMBER(7, 2),
ops$tkyte@ORA9IR2> REM COMM NUMBER(7, 2),
ops$tkyte@ORA9IR2> REM DEPTNO NUMBER(2)
ops$tkyte@ORA9IR2> REM )
ops$tkyte@ORA9IR2> REM ORGANIZATION EXTERNAL
ops$tkyte@ORA9IR2> REM ( type oracle_loader
ops$tkyte@ORA9IR2> REM default directory data_dir
ops$tkyte@ORA9IR2> REM access parameters
ops$tkyte@ORA9IR2> REM ( fields terminated by ',' )
ops$tkyte@ORA9IR2> REM location ('emp.dat')
ops$tkyte@ORA9IR2> REM )
ops$tkyte@ORA9IR2> REM /
ops$tkyte@ORA9IR2> REM
ops$tkyte@ORA9IR2> REM create table emp as select * from scott.emp where 1=0;
ops$tkyte@ORA9IR2> REM exec gen_data('EMP',50000);
ops$tkyte@ORA9IR2> REM insert /*+ append */ into emp select * from emp;
ops$tkyte@ORA9IR2> REM commit;
ops$tkyte@ORA9IR2> REM insert /*+ append */ into emp select * from emp;
ops$tkyte@ORA9IR2> REM commit;
ops$tkyte@ORA9IR2> REM insert /*+ append */ into emp select * from emp;
ops$tkyte@ORA9IR2> REM commit;
ops$tkyte@ORA9IR2> REM insert /*+ append */ into emp select * from emp;
ops$tkyte@ORA9IR2> REM commit;
ops$tkyte@ORA9IR2> REM insert /*+ append */ into emp select * from emp;
ops$tkyte@ORA9IR2> REM commit;
ops$tkyte@ORA9IR2> REM host flat / emp > /tmp/emp.dat
ops$tkyte@ORA9IR2> set autotrace on
ops$tkyte@ORA9IR2> select count(*) from external_table
2 /
COUNT(*)
----------
1600000
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 EXTERNAL TABLE ACCESS (FULL) OF 'EXTERNAL_TABLE'
Statistics
----------------------------------------------------------
8 recursive calls
0 db block gets
25 consistent gets
0 physical reads
0 redo size
380 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
ops$tkyte@ORA9IR2> select count(*) from emp
2 /
COUNT(*)
----------
1600000
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'EMP'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
12619 consistent gets
7649 physical reads
420 redo size
380 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec runStats_pkg.rs_start;
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> select count(*) from external_table;
COUNT(*)
----------
1600000
ops$tkyte@ORA9IR2> select count(*) from external_table;
COUNT(*)
----------
1600000
ops$tkyte@ORA9IR2> select count(*) from external_table;
COUNT(*)
----------
1600000
ops$tkyte@ORA9IR2> exec runStats_pkg.rs_middle;
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> create table stage as select * from external_table;
Table created.
ops$tkyte@ORA9IR2> select count(*) from stage;
COUNT(*)
----------
1600000
ops$tkyte@ORA9IR2> select count(*) from stage;
COUNT(*)
----------
1600000
ops$tkyte@ORA9IR2> select count(*) from stage;
COUNT(*)
----------
1600000
ops$tkyte@ORA9IR2> exec runStats_pkg.rs_stop(1000);
Run1 ran in 390 hsecs
Run2 ran in 780 hsecs
run 1 ran in 50% of the time
Name Run1 Run2 Diff
STAT...OS Page reclaims 0 1,038 1,038
STAT...OS User time used 0 1,049 1,049
STAT...db block gets 530 1,642 1,112
STAT...recursive calls 33 1,150 1,117
STAT...db block changes 1,024 2,230 1,206
STAT...redo entries 509 1,723 1,214
LATCH.redo allocation 509 1,735 1,226
LATCH.library cache 156 1,756 1,600
LATCH.simulator lru latch 1 2,365 2,364
LATCH.simulator hash latch 1 4,799 4,798
LATCH.multiblock read objects 0 9,450 9,450
STAT...physical writes 0 12,599 12,599
STAT...physical writes non che 0 12,599 12,599
STAT...physical writes direct 0 12,599 12,599
LATCH.row cache enqueue latch 26,826 10,310 -16,516
LATCH.row cache objects 40,236 14,906 -25,330
STAT...prefetched blocks 0 33,072 33,072
STAT...table scan blocks gotte 6 37,801 37,795
STAT...physical reads 0 37,797 37,797
LATCH.cache buffers lru chain 8 37,807 37,799
STAT...free buffer requested 4 37,808 37,804
STAT...no work - consistent re 14 37,986 37,972
STAT...buffer is not pinned co 75 38,258 38,183
STAT...consistent gets 82 38,511 38,429
STAT...session logical reads 612 40,153 39,541
STAT...session pga memory 0 65,536 65,536
STAT...session pga memory max 0 65,536 65,536
STAT...redo size 65,300 180,832 115,532
LATCH.cache buffers chains 2,698 121,499 118,801
STAT...table scan rows gotten 24 4,800,010 4,799,986
Run1 latches total versus runs -- difference and pct
Run1 Run2 Diff Pct
70,908 209,249 138,341 33.89%
PL/SQL procedure successfully completed.
<b>this is by NO MEANS conclusive proof of anything</b>.... I would seriously consider a benchmark -- since it is so easy to do.... just name your external table after your stage table name and see how it does...
Benchmark
A reader, January 17, 2005 - 9:39 pm UTC
But that's just the thing: it's kind of hard to benchmark using these tools for measuring resource usage when using external tables. In the example you showed,
Statistics
----------------------------------------------------------
8 recursive calls
0 db block gets
25 consistent gets
0 physical reads
all the LIOs above are due to reading the dictionary (8 recursive calls) to parse the access parameters and such. They have nothing to do with the resource usage of the access driver to actually load the file.
Comparing this to
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
12619 consistent gets
7649 physical reads
is not really apples-to-apples, is it?
The only relevant statistic regarding external table resource usage is wall-clock time, which is really useless except in a single-user system.
Thanks
January 18, 2005 - 9:36 am UTC
a benchmark is many things -- that is why I say sometimes "in general, when tuning a query, the goal is to reduce LIOs" -- but the keyword is in general. Consider:
ops$tkyte@ORA9IR2> create table t as select * from all_objects;
Table created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set arraysize 2
ops$tkyte@ORA9IR2> set autotrace traceonly statistics;
ops$tkyte@ORA9IR2> select * from t;
27947 rows selected.
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
14176 consistent gets
383 physical reads
0 redo size
3418435 bytes sent via SQL*Net to client
154202 bytes received via SQL*Net from client
13975 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
27947 rows processed
ops$tkyte@ORA9IR2> select * from t order by 2;
27947 rows selected.
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
387 consistent gets
0 physical reads
0 redo size
3212746 bytes sent via SQL*Net to client
154202 bytes received via SQL*Net from client
13975 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
27947 rows processed
ops$tkyte@ORA9IR2> set autotrace off
Now, that second query -- where did the LIO's go? Well, they went away obviously, but where? they went to TEMP.
For you see, we don't do an LIO on TEMP -- we don't do a consistent get against temp, we don't NEED to.
Same with ETs -- we don't need to go to the buffer cache, we don't need to hit the buffer cache over and over.
So, which query is "better"?
Benchmark it -- whichever one in your environment runs best wins. Factors to consider are latching and such -- but hey, this is a bulk load so multi-user scaling doesn't count.
You need to look at what you are trying to do, benchmark different implementations (easy in this case since you just name your external table after your stage table and run the same code effectively "two times") and see which flys the best in your case.
Here, I was looking predominantly at "time to complete". If we just benchmark 3 full scans against an external table and a real table, the real table wins hands down. However, throw in the full scan of the external table and a load of a stage table followed by 3 full scans and the external table wins -- but then again, when you do something more complex than just full scan -- you join, hash, whatever -- you might find something different.
Benchmark
A reader, January 18, 2005 - 9:51 am UTC
Valid points, I will keep them in mind.
Another question:
When I do something like
select ...
from et1,et2,et3
where ...
i.e. treat the ext tables as real tables and join them -- the only access path for an ext table is a full scan, so the above statement will always perform better with 3 real tables than 3 external tables, right?
Thanks
January 18, 2005 - 10:13 am UTC
maybe -- maybe not.
maybe not if before you do this -- you have to load 3 real tables!
remember, they will probably be hash joined -- so they are full scanned into memory, hashed, perhaps swapped out to temp. Just like a "real table". beyond the act of full scanning, what happens afterwards is the same.
If you have fixed width files (not delimited) you might even find the external table to be "as fast" as a real table (most of the time with external tables is spent parsing, using positional files -- which are larger -- parse faster)
problem with external table
yogi, January 21, 2005 - 6:50 pm UTC
Tom,
I have problem with using external table as follows:
create table external_table
(switch varchar2(200),
location varchar2(200),
npa varchar2(20)
)
organization external
(typ oracle_loader
default directory data_dir
access parameters
( records delimited by newline
nologfile
fields terminated by ','
missing field values are null
)
location('switch_npa.csv')
)
reject limit unlimited
/
I get following error .
ORA-29913: error in executing ODCIEXTTABLEFETCH callout.
Surprisingly, I don't get any other details following this error.
Can you please tell me what could be possible causes for this error? In fact, I can get data from the csv file using utl_file, using the same directory. So it is not a problem with file permissions.
January 21, 2005 - 8:19 pm UTC
I did have to change "TYP" to "TYPE" in the typ oracle_loader line...
but if that wasn't the case for you -- can you make a full test case like this fail?
ops$tkyte@ORA9IR2> create or replace directory data_dir as '/tmp'
2 /
Directory created.
ops$tkyte@ORA9IR2> !echo a,b,c > /tmp/switch_npa.csv
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create table external_table
2 (switch varchar2(200),
3 location varchar2(200),
4 npa varchar2(20)
5 )
6 organization external
7 (type oracle_loader
8 default directory data_dir
9 access parameters
10 ( records delimited by newline
11 nologfile
12 fields terminated by ','
13 missing field values are null
14 )
15 location('switch_npa.csv')
16 )
17 reject limit unlimited
18 /
Table created.
ops$tkyte@ORA9IR2> select * from external_table;
SWITC LOCAT NPA
----- ----- --------------------
a b c
External table
A reader, January 22, 2005 - 2:00 pm UTC
Thanks for the reply.
I still get 'ORA-29913: error in executing ODCIEXTTABLEFETCH callout'
Quick question on file permissions.
The Oracle directory was created for '/btd/sql/data'. I have read permission on '/btd/sql/data' but don't have write permission. Does Oracle need write permission on the unix directory? I use 'NOLOGFILE' in the access parameters so that Oracle does not create any log files in that directory.
I truly appreciate your help.
thanks.
January 22, 2005 - 4:31 pm UTC
yes, if you didn't specify a log/bad file location to be "elsewhere".
it'll try to create them there and fail.
you still have the "bad" files to worry about too -- best to set up another directory you do have write permission on and send them over there (they would be useful for figuring out "what's wrong")
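A sketch of that setup, using the table from earlier in the thread. The LOG_DIR name and its path are assumptions, not something from this thread:

```sql
-- Sketch: keep DATA_DIR read-only, route the log and bad files to a
-- second directory object the oracle software owner can write to.
-- LOG_DIR and '/some/writable/dir' are made-up names.
create or replace directory log_dir as '/some/writable/dir';

create table external_table
( switch   varchar2(200),
  location varchar2(200),
  npa      varchar2(20)
)
organization external
( type oracle_loader
  default directory data_dir
  access parameters
  ( records delimited by newline
    badfile log_dir:'switch_npa.bad'
    logfile log_dir:'switch_npa.log'
    fields terminated by ','
    missing field values are null
  )
  location ('switch_npa.csv')
)
reject limit unlimited;
```

With the bad file landing somewhere writable, rejected records are preserved for inspection instead of causing the open to fail.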
Validating data
A reader, January 31, 2005 - 4:07 pm UTC
I have an external table that needs to be validated against some business rules (rule1 through ruleN), and data that violates the rules should be spit out:
select
pk,
case when a+b!=c then 'rule1 violated'
when x+y!=z then 'rule2 violated'
...
end
from
et
where
(
(a+b!=c) or
(x+y!=z) or
...
);
How can I avoid putting the rules in both the SELECT and the WHERE clauses?
Thanks
January 31, 2005 - 4:17 pm UTC
select *
from ( select pk, case when this then 'rule1 violated'
when that then 'rule2 violated'
....
end rule_check
from et
)
where rule_check IS NOT NULL;
A reader, January 31, 2005 - 4:21 pm UTC
Brilliant!
Elapsed time
A reader, January 31, 2005 - 4:44 pm UTC
Strange...
select
pk,
case when a+b!=c then 'rule1 violated'
when x+y!=z then 'rule2 violated'
...
end
from
et
where
(
(a+b!=c) or
(x+y!=z) or
...
)
shows
Elapsed: 00:00:18.31
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
37312 consistent gets
34798 physical reads
60 redo size
7974 bytes sent via SQL*Net to client
1161 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
6 rows processed
while
your version shows
Elapsed: 00:01:07.72
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
37311 consistent gets
34801 physical reads
0 redo size
995 bytes sent via SQL*Net to client
1174 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
6 rows processed
The LIO/PIO are almost identical, so why does your version take so much longer elapsed time than mine?
Thanks
January 31, 2005 - 4:54 pm UTC
tkprof it. with 10046 level 12 (waits...)
and for grins, add rownum to the inline view first and a where rownum > 0 to the inline view.
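For anyone following along, that suggestion amounts to something like this sketch, applied to the rule-check query from above (the columns are the thread's own placeholders; the intent, as I read it, is that rownum keeps the optimizer from merging the inline view into the outer query):

```sql
-- Sketch: the rownum suggestion applied to the earlier rule-check query.
-- et, pk, a, b, c, x, y, z are placeholder names from the thread.
select *
  from ( select pk,
                rownum r,                       -- forces the view to stay a view
                case when a+b != c then 'rule1 violated'
                     when x+y != z then 'rule2 violated'
                end rule_check
           from et
          where rownum > 0                      -- same effect on the predicate side
       )
 where rule_check is not null;
```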
Here is the tkprof with 10046 level 12
A reader, January 31, 2005 - 4:56 pm UTC
Also, I loaded the ET into a real table to remove any differences due to the access driver
Mine
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.02 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 14.07 18.43 34796 37312 0 6
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 14.09 18.45 34796 37312 0 6
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 24
Rows Row Source Operation
------- ---------------------------------------------------
6 TABLE ACCESS FULL REAL_TABLE
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
db file scattered read 634 0.05 7.16
db file sequential read 627 0.00 0.15
SQL*Net message from client 2 0.00 0.01
Yours
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 74.16 76.42 34788 37312 0 6
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 74.17 76.44 34788 37312 0 6
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 24
Rows Row Source Operation
------- ---------------------------------------------------
6 TABLE ACCESS FULL REAL_TABLE
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
db file sequential read 640 0.01 0.09
db file scattered read 623 0.04 6.57
SQL*Net message from client 2 0.00 0.01
********************************************************************************
January 31, 2005 - 5:56 pm UTC
can I have a real test case to play with?
did you do the rownum test like I asked?
rownum
A reader, January 31, 2005 - 5:19 pm UTC
If I add the rownum stuff to the inline view and the outer query, it doesn't make it any faster, it just changes my row source operation counts:
Rows Row Source Operation
------- ---------------------------------------------------
6 VIEW
1415090 COUNT
1415090 FILTER
1415090 TABLE ACCESS FULL REAL_TABLE
Rows Row Source Operation
------- ---------------------------------------------------
6 VIEW
1415090 COUNT
1415090 TABLE ACCESS FULL REAL_TABLE
My version is still 18sec vs your 75sec
Thanks
January 31, 2005 - 6:03 pm UTC
need test
Test case
VA, January 31, 2005 - 6:30 pm UTC
I will try to whip up a small, concise, complete test case, but the reason seems obvious to me.
Since Oracle fetches the rows satisfying the WHERE clause and then applies any expressions in the SELECT list, my version is fetching 6 rows and applying my business rules (CASE ... END) to just those 6 rows.
Your version fetches all 1.5 million rows, executes the CASE ... END for each of those rows, and finally throws out all but 6 rows.
Won't that account for the large difference in elapsed time between the 2 approaches?
January 31, 2005 - 8:27 pm UTC
well,
select pk, case when this then 'x'
when that then 'y'
when the_other_thing then 'z'
end
from table
where this OR that OR the_other_thing
would do the where clause against 1.5 million rows and then apply the case (with essentially the same conditions) to the survivors.
this could be a situation where where-clause processing is much faster than doing the equivalent in a CASE.
Because I do the case against 1.5 million rows, then dump.
You do the where against 1.5 million, then the case against 6, then dump.
So, the only "obvious" thing to me would be that CASE processing is much slower than the "where", because we both process 1.5 million rows with the same conditions.
This is why we need test cases to see what you see..
For a simple test case on my side (1,000,000 rows):
select *
from big_table
where object_id+5-to_char(created,'ddyyyy') > 1000000
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 1.51 1.55 11789 14381 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 1.51 1.55 11789 14381 0 0
********************************************************************************
select *
from (
select big_table.* ,
case when object_id+5-to_char(created,'ddyyyy') > 1000000 then 1 end cond
from big_table
)
where cond is not null
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 1.65 1.66 11789 14381 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 1.65 1.67 11789 14381 0 0
shows no significant difference (run this again and the timings are reversed)
select *
from big_table
where object_id+5-to_char(created,'ddyyyy') > 1000000
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 1.71 1.73 11791 14381 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 1.71 1.73 11791 14381 0 0
********************************************************************************
select *
from (
select big_table.* ,
case when object_id+5-to_char(created,'ddyyyy') > 1000000 then 1 end cond
from big_table
)
where cond is not null
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 1.63 1.66 11790 14381 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 1.63 1.66 11790 14381 0 0
so there must be something special about your case, and I'm curious....
Test case
A reader, January 31, 2005 - 10:40 pm UTC
Don't know how I can make a test case; it's a plain table with about 100 columns and 1.5 million rows. The CASE statement does stuff like
case
when the_type=0 and a+b!=c then 'rule1 violated'
when the_type=1 and a+b+c!=d then 'rule2 violated'
...
there are 6 such WHEN conditions
Why would my results be different than yours? Do you know of other instances where "WHERE clause processing is faster than the same thing in a CASE"?
Thanks
February 01, 2005 - 8:39 am UTC
show me the actual query then.
Look -- I'm willing to look into it, try to figure it out, file a bug if need be.
But -- I need help reproducing your issue. I just showed above "in general, not always true".
Show us the counter case.
Actual query
A reader, February 01, 2005 - 9:07 am UTC
Here is your version
SELECT * FROM (
SELECT
CASE
WHEN transaction_code = 'Z' AND adjustment_type = 0
THEN transaction_id || ' adjustment trade with no adjustment'
WHEN transaction_code != 'Z' AND adjustment_type != 0
THEN transaction_id || ' non-adjustment trade with an adjustment'
WHEN adjustment_type = 0 AND (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0
THEN transaction_id || ' contains unexpected adjustment'
WHEN adjustment_type = 1 AND (oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0
THEN transaction_id || ' contains more than OID on sale adjustment'
WHEN adjustment_type = 2 AND (oid_on_sale_amt + oid_owned_amt + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0
THEN transaction_id || ' contains more than OID on deliver adjustment'
WHEN adjustment_type = 3 AND (oid_on_sale_amt + oid_on_deliver_amount + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0
THEN transaction_id || ' contains more than OID owned adjustment'
WHEN adjustment_type = 4 AND (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_adj_amt + adj_to_bond_amt) != 0
THEN transaction_id || ' contains more than MTM income adjustment'
WHEN adjustment_type = 5 AND (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + adj_to_bond_amt) != 0
THEN transaction_id || ' contains more than MTM adjustment'
WHEN adjustment_type = 6 AND (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + mtm_adj_amt) != 0
THEN transaction_id || ' contains more than bond adjustment'
WHEN adjustment_type NOT IN (0, 1, 2, 3, 4, 5, 6)
THEN transaction_id || ' contains unknown adjustment type'
ELSE NULL
END error_message,a.transaction_id
FROM stagetransactions a
WHERE run_sequence = -1
)
WHERE error_message IS NOT NULL;
Here is my version
SELECT
CASE
WHEN transaction_code = 'Z' AND adjustment_type = 0
THEN transaction_id || ' adjustment trade with no adjustment'
WHEN transaction_code != 'Z' AND adjustment_type != 0
THEN transaction_id || ' non-adjustment trade with an adjustment'
WHEN adjustment_type = 0 AND (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0
THEN transaction_id || ' contains unexpected adjustment'
WHEN adjustment_type = 1 AND (oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0
THEN transaction_id || ' contains more than OID on sale adjustment'
WHEN adjustment_type = 2 AND (oid_on_sale_amt + oid_owned_amt + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0
THEN transaction_id || ' contains more than OID on deliver adjustment'
WHEN adjustment_type = 3 AND (oid_on_sale_amt + oid_on_deliver_amount + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0
THEN transaction_id || ' contains more than OID owned adjustment'
WHEN adjustment_type = 4 AND (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_adj_amt + adj_to_bond_amt) != 0
THEN transaction_id || ' contains more than MTM income adjustment'
WHEN adjustment_type = 5 AND (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + adj_to_bond_amt) != 0
THEN transaction_id || ' contains more than MTM adjustment'
WHEN adjustment_type = 6 AND (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + mtm_adj_amt) != 0
THEN transaction_id || ' contains more than bond adjustment'
WHEN adjustment_type NOT IN (0, 1, 2, 3, 4, 5, 6)
THEN transaction_id || ' contains unknown adjustment type'
ELSE NULL
END error_message,a.transaction_id
FROM stagetransactions a
WHERE
(
(transaction_code = 'Z' AND adjustment_type = 0) OR
(transaction_code != 'Z' AND adjustment_type != 0) OR
(adjustment_type = 0 AND (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0) OR
(adjustment_type = 1 AND (oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0) OR
(adjustment_type = 2 AND (oid_on_sale_amt + oid_owned_amt + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0) OR
(adjustment_type = 3 AND (oid_on_sale_amt + oid_on_deliver_amount + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0) OR
(adjustment_type = 4 AND (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_adj_amt + adj_to_bond_amt) != 0) OR
(adjustment_type = 5 AND (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + adj_to_bond_amt) != 0) OR
(adjustment_type = 6 AND (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + mtm_adj_amt) != 0) OR
(adjustment_type NOT IN (0, 1, 2, 3, 4, 5, 6))
)
Thanks
February 01, 2005 - 2:00 pm UTC
I'll need more help reproducing...
drop table stagetransactions;
create table stagetransactions ( transaction_code varchar2(1), adjustment_type number, transaction_id number ,
oid_on_sale_amt number, oid_on_deliver_amount number, oid_owned_amt number, mtm_income_amt number, mtm_adj_amt number, adj_to_bond_amt number , run_sequence
number )
/
insert into stagetransactions
select 'Z', 1, 42, 0, 0, 0, 0, 0, 0, -1
from big_table.big_table where rownum <= 1500000;
SELECT * FROM (
SELECT
CASE
WHEN transaction_code = 'Z' AND adjustment_type = 0
THEN transaction_id || ' adjustment trade with no adjustment'
...
THEN transaction_id || ' contains unknown adjustment type'
ELSE NULL
END error_message,a.transaction_id
FROM stagetransactions a
WHERE run_sequence = -1
)
WHERE error_message IS NOT NULL
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 2.97 2.94 921 6049 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 2.97 2.94 921 6049 0 0
********************************************************************************
SELECT
CASE
WHEN transaction_code = 'Z' AND adjustment_type = 0
THEN transaction_id || ' adjustment trade with no adjustment'
...
THEN transaction_id || ' contains unknown adjustment type'
ELSE NULL
END error_message,a.transaction_id
FROM stagetransactions a
WHERE
(
(transaction_code = 'Z' AND adjustment_type = 0) OR
...
(adjustment_type NOT IN (0, 1, 2, 3, 4, 5, 6))
)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 2.51 2.48 524 6049 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 2.51 2.48 524 6049 0 0
A reader, February 01, 2005 - 9:12 am UTC
Ignore the ELSE NULL part in the CASE expression above; I thought it might make a difference but it didn't
A reader, February 01, 2005 - 2:29 pm UTC
Sigh... don't know why we are getting such different results. I did the exact same thing
SQL> drop table t;
Table dropped.
1 create table t as select
2 adj_to_bond_amt,
3 adjustment_type,
4 mtm_adj_amt,
5 mtm_income_amt,
6 oid_on_deliver_amount,
7 oid_on_sale_amt,
8 oid_owned_amt,
9 transaction_code,
10 transaction_id,
11 run_sequence
12* from stagetransactions
SQL> /
Table created.
SQL> select count(*) from t;
COUNT(*)
----------
1415090
SQL> set autotrace traceonly
SQL> /
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'T'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6465 consistent gets
6462 physical reads
0 redo size
494 bytes sent via SQL*Net to client
651 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Then I ran the 2 queries above and got
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 61.51 60.41 6462 6466 0 6
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 61.52 60.42 6462 6466 0 6
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 24
Rows Row Source Operation
------- ---------------------------------------------------
6 TABLE ACCESS FULL T
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
db file scattered read 101 0.06 0.81
SQL*Net message from client 2 0.02 0.03
for your version
and
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.02 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 10.02 10.09 6462 6466 0 6
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 10.04 10.10 6462 6466 0 6
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 24
Rows Row Source Operation
------- ---------------------------------------------------
6 TABLE ACCESS FULL T
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
db file scattered read 101 0.10 0.78
SQL*Net message from client 2 0.00 0.00
********************************************************************************
for mine
LIOs are identical but huge difference in elapsed time.
Anything else I can look at? init.ora parameters or something? Thanks
February 01, 2005 - 3:45 pm UTC
what is your version? what is your machine? (I was doing this on a desktop, a year-old desktop -- your machine seems "slow" from the get-go)
how big would that dmp file for that table be after being compressed and would you be able to send it?
(DO NOT SEND it just let me know the details)
one more thing, I wonder -- is this a sql_trace bug... there was one where cpu time would be "overdone", and overdone differently for different queries.
use this:
------------- mystat --------------
set echo off
set verify off
column value new_val V
define S="&1"
set autotrace off
select a.name, b.value
from v$statname a, v$mystat b
where a.statistic# = b.statistic#
and lower(a.name) like '%' || lower('&S')||'%'
/
set echo on
-----------------------------------
------------ mystat2 ------------
set echo off
set verify off
select a.name, b.value V, b.value-&V diff
from v$statname a, v$mystat b
where a.statistic# = b.statistic#
and lower(a.name) like '%' || lower('&S')||'%'
/
set echo on
----------------------------------------
and test like this:
ops$tkyte@ORA9IR2> @mystat "cpu used by this session"
ops$tkyte@ORA9IR2> set echo off
NAME VALUE
------------------------------ ----------
CPU used by this session 568
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> SELECT * FROM (
2 SELECT
3 CASE
4 WHEN transaction_code = 'Z' AND adjustment_type = 0
5 THEN transaction_id || ' adjustment trade with no adjustment'
...
25 END error_message,a.transaction_id
26 FROM stagetransactions a
27 WHERE run_sequence = -1
28 )
29 WHERE error_message IS NOT NULL;
no rows selected
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> @mystat2
ops$tkyte@ORA9IR2> set echo off
NAME V DIFF
------------------------------ ---------- ----------
CPU used by this session 856 288
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> @mystat "cpu used by this session"
ops$tkyte@ORA9IR2> set echo off
NAME VALUE
------------------------------ ----------
CPU used by this session 856
ops$tkyte@ORA9IR2> SELECT
2 CASE
3 WHEN transaction_code = 'Z' AND adjustment_type = 0
...
37 (adjustment_type NOT IN (0, 1, 2, 3, 4, 5, 6))
38 )
39 /
no rows selected
ops$tkyte@ORA9IR2> @mystat2
ops$tkyte@ORA9IR2> set echo off
NAME V DIFF
------------------------------ ---------- ----------
CPU used by this session 1110 254
A reader, February 01, 2005 - 3:53 pm UTC
Oracle 9.2.0.6
Machine is Sun Enterprise 450 running Solaris 8
$ uname -a
SunOS xxxx 5.8 Generic_117350-14 sun4u sparc SUNW,Ultra-4
$ prtconf -v|head
System Configuration: Sun Microsystems sun4u
Memory size: 4096 Megabytes
System Peripherals (Software Nodes):
SUNW,Ultra-4
SQL> show sga
Total System Global Area 1499435184 bytes
Fixed Size 733360 bytes
Variable Size 436207616 bytes
Database Buffers 1056964608 bytes
Redo Buffers 5529600 bytes
pga_aggregate_target big integer 1048576000
db_cache_size big integer 1056964608
shared_pool_size big integer 218103808
ls -l expdat.dmp.gz for that 1 table is 4375307 bytes
mpstat
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 2 0 333 400 300 57 0 1 1 0 130 1 1 1 98
1 2 0 33 160 159 93 0 3 3 0 96 0 1 1 98
2 2 0 29 103 101 95 0 3 1 0 139 1 1 1 98
3 2 0 24 129 127 100 0 3 1 0 139 1 1 1 98
Thanks
February 01, 2005 - 4:07 pm UTC
go ahead, email the dump file (tkyte@oracle.com). I might not look at it for a day or two, just getting on a plane to go far away...
....
Ok, turns out it was your specific set of data.
1* select adjustment_type, count(*) from t group by adjustment_type
ops$tkyte@ORA9IR2> /
A COUNT(*)
- ----------
1 6
2 2
3 1
5 2
6 6
1415073
6 rows selected.
almost all of your adjustment_types were null; the where clause was better able to "remove them from consideration".
Adding a simple "when adjustment_type is null then null" branch to the front of the case makes the case by itself faster (than the where + case):
ops$tkyte@ORA9IR2> SELECT * FROM (
2 SELECT
3 CASE
4 when adjustment_type is null
5 then null
6 WHEN transaction_code = 'Z' AND adjustment_type = 0
7 THEN transaction_id || ' adjustment trade with no adjustment'
8 WHEN transaction_code != 'Z' AND adjustment_type != 0
9 THEN transaction_id || ' non-adjustment trade with an adjustment'
10 WHEN (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0 and adjustment_type=0
11 THEN transaction_id || ' contains unexpected adjustment'
12 WHEN (oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0 and adjustment_type=1
13 THEN transaction_id || ' contains more than OID on sale adjustment'
14 WHEN (oid_on_sale_amt + oid_owned_amt + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0 and adjustment_type=2
15 THEN transaction_id || ' contains more than OID on deliver adjustment'
16 WHEN (oid_on_sale_amt + oid_on_deliver_amount + mtm_income_amt + mtm_adj_amt + adj_to_bond_amt) != 0 and adjustment_type=3
17 THEN transaction_id || ' contains more than OID owned adjustment'
18 WHEN (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_adj_amt + adj_to_bond_amt) != 0 and adjustment_type=4
19 THEN transaction_id || ' contains more than MTM income adjustment'
20 WHEN (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + adj_to_bond_amt) != 0 and adjustment_type=5
21 THEN transaction_id || ' contains more than MTM adjustment'
22 WHEN (oid_on_sale_amt + oid_on_deliver_amount + oid_owned_amt + mtm_income_amt + mtm_adj_amt) != 0 and adjustment_type=6
23 THEN transaction_id || ' contains more than bond adjustment'
24 WHEN adjustment_type NOT IN (0, 1, 2, 3, 4, 5, 6)
25 THEN transaction_id || ' contains unknown adjustment type'
26 ELSE NULL
27 END error_message,a.transaction_id, rownum r
28 FROM t a
29 WHERE run_sequence = -1
30 )
31 WHERE error_message IS NOT NULL;
sql_trace
A reader, February 01, 2005 - 3:59 pm UTC
I don't think sql_trace has anything to do with this.
Even if I just do
set timing on
without any tracing, one SQL is much, much slower than the other
February 02, 2005 - 4:40 am UTC
see above....
monitor file system growth using external tables
N.Balasubramanian, February 02, 2005 - 12:41 am UTC
Hi Tom
I am using HP-UX. I want the output of the bdf command loaded into a table using an external table. Following the example given in your book, I am able to do it successfully. However, for certain filesystems, where the filesystem name is long, the result is displayed on two lines. In such cases I have to manually alter the flat file. How can I overcome this so that I can automate the entire process?
February 02, 2005 - 5:02 am UTC
well, if you want the long name -- you'll have to bdf | awk ... and fix it up.
if you just want the truncated name, bdf | grep to filter out the ones you don't need.
else, you'll be writing a plsql loop or using lag/lead to fix it up.
but not having an example to work with, well, makes it hard.
I am using awk
N.Balasubramanian, February 02, 2005 - 7:06 am UTC
I am using awk to make the comma-separated file, but it still doesn't work.
The output of bdf is as follows:
Filesystem kbytes used avail %used Mounted on
/dev/vg00/lvol3 155648 143743 11349 93% /
/dev/vg00/lvol1 63765 27760 29628 48% /stand
/dev/vg02/ltraceora
2097152 1213714 835104 59% /traceora
/dev/vg00/lvol5 524288 119150 380713 24% /tmp
/dev/vg02/ltestzip 5120000 2863016 2116654 57% /testzip
/dev/vg01/lorarac 5324800 3392018 1823235 65% /rac9i
/dev/vg02/lcrashdmp
4194304 2018936 2039445 50% /q4crash
/dev/vg02/lpstage 4194304 2026629 2032213 50% /pstage
/dev/vg01/lotrace 716800 714377 2423 100% /oratrace
/dev/vg02/lorapatch
6553600 6346021 194607 97% /orapatch
/dev/vg02/ltestora 5632000 4692376 882518 84% /ora9i
Moreover, I want all the file systems loaded into the table. So I can't use grep -v to avoid the long ones.
Please help.
Thanks
February 02, 2005 - 7:56 am UTC
so modify your awk so that when the kbytes are missing, you don't output anything; remember the filesystem name, read the next line instead, and then output the filesystem, kbytes, etc. together....
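One way to sketch that -- assuming bdf's usual layout, where a long filesystem name is printed alone on its own line and the remaining five columns follow on the next line (the function name and field handling here are assumptions based on the sample output above; verify against your actual bdf output):

```shell
# Sketch only: stitch bdf's wrapped lines back together and emit CSV.
bdf_to_csv() {
  awk '
    NR == 1 { next }                  # skip the header line
    NF == 1 { fs = $1; next }        # long name alone on its line: remember it
    NF == 5 { print fs "," $1 "," $2 "," $3 "," $4 "," $5; next }  # continuation
    NF == 6 { print $1 "," $2 "," $3 "," $4 "," $5 "," $6 }        # normal entry
  '
}

# usage: bdf | bdf_to_csv > /your/data_dir/bdf.csv
```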
Thank You
N.Balasubramanian, February 02, 2005 - 8:28 am UTC
Thank you Tom. I will try that.
Brilliant!
VA, February 02, 2005 - 9:35 am UTC
"almost all of your adjustment_types were null, the where clause was better able to "remove them from consideration""
Your version works like a champ now, much faster than mine.
But I am not sure I understand what happened here.
My WHERE clause was made up of predicates that all involved equality/inequality tests on adjustment_type. Since this was NULL for most of the rows, are you saying that Oracle intelligently skipped those rows without evaluating the rest of each predicate, and that's what resulted in the speed boost for my version?
February 02, 2005 - 9:49 am UTC
I think (guessing here) that the where clause was able to abort evaluation early, since adjustment_type was used in the entire thing (each OR branch returns UNKNOWN when it is null).
The case statement would have tried each of the whens (it isn't a predicate tree, it is one big "function").
By handling first the case that makes all the other cases moot, we blow out of the case early -- just like the where clause did (faster, even)
cannot access in procedure
A reader, February 04, 2005 - 5:13 pm UTC
Tom,
I have created a directory object ref_log_dir as '/bsd/log';
I can use this directory from independent block as following :
SQL> declare
2 v_output_file utl_file.file_type;
3 begin
4 v_output_file := utl_file.fopen( 'REF_LOG_DIR', 'convert.log', 'a');
5 utl_file.put( v_output_file, 'testing' || USER);
6 utl_file.new_line(v_output_file);
7 utl_file.fclose(v_output_file );
8 end;
9 /
PL/SQL procedure successfully completed.
If I use this directory in a procedure, I get an access denied error:
1 create or replace procedure test_dir IS
2 v_output_file utl_file.file_type;
3 begin
4 v_output_file := utl_file.fopen( 'REF_LOG_DIR', 'convert.log', 'a');
5 utl_file.put( v_output_file, 'testing' || USER);
6 utl_file.new_line(v_output_file);
7 utl_file.fclose(v_output_file );
8* end;
SQL> /
Procedure created.
SQL> execute test_dir;
BEGIN test_dir; END;
*
ERROR at line 1:
ORA-29289: directory access denied
ORA-06512: at "SYS.UTL_FILE", line 18
ORA-06512: at "SYS.UTL_FILE", line 424
ORA-06512: at "OMSCONV.TEST_DIR", line 4
ORA-06512: at line 1
Do I have to give any other grants/permissions on the directory object in order to use it in a procedure?
thanks.
February 05, 2005 - 5:17 am UTC
External Table files are on client machine
Praveen, March 10, 2005 - 5:00 am UTC
Tom,
I am trying to write a procedure which will accept a string IN parameter containing a directory path. Using this path, a 'directory' object is created, and external tables are then created based on it (they will later take part in a complex ETL process). The platform is 9iR2 on Windows 2000.
a) Is it possible to create directory objects based on network paths? One cannot say on which client machine the files, upon which the external tables are created, will reside at a particular time.
b) If not possible, is there any work-around for the above problem?
Regards
Praveen
external tables : field separator and date_format
laurent, March 17, 2005 - 12:45 pm UTC
Hi Tom,
I'm trying to create an external table for the following file sample:
assurance3-x_XGDR00 C /base/oracle/XGDR00 06 Mar 05 08:05:08 06 Mar 05 11:28:00
assurance1-x_pro C /base/oracle/pro 06 Mar 05 08:07:05 06 Mar 05 12:49:00
I need to have it this way (comma-separating the fields I want):
assurance3-x_XGDR00,C,/base/oracle/XGDR00,06 Mar 05 08:05:08,06 Mar 05 11:28:00
I think the problem is that my field separator ' ' also appears in the date fields.
I tried the following, but I got an error when I select from the table:
create or replace directory FIT_DIR AS '/projets/fit/home/fitbatch/rp';
grant read on directory FIT_DIR to fit00batch;
create table ext_tina_jour_save_ctrlm (
Client VARCHAR2(30),
Strategie VARCHAR2(10),
Repertoire_sauvegarde VARCHAR2(255),
Date_debut date,
Date_fin date
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY FIT_DIR
ACCESS PARAMETERS ( RECORDS DELIMITED BY NEWLINE
NOLOGFILE
FIELDS TERMINATED BY ' '
(client,
strategie,
Repertoire_sauvegarde,
Date_debut date_format date mask "dd-mon-yy hh24:mi:ss",
Date_fin date_format date mask "dd mon yy hh24:mi:ss"
)
LRTRIM )
LOCATION ('tina_psa089_jour_save_ctrlm')
);
SQL> select * from ext_tina_jour_save_ctrlm
2 ;
select * from ext_tina_jour_save_ctrlm
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-00554: error encountered while parsing access parameters
KUP-01005: syntax error: found "date_format": expecting one of: "comma, char,
date, defaultif, decimal, double, float, integer, (, nullif, oracle_date,
oracle_number, position, raw, recnum, ), unsigned, varrawc, varchar, varraw,
varcharc, zoned"
KUP-01007: at line 7 column 41
ORA-06512: at "SYS.ORACLE_LOADER", line 14
ORA-06512: at line 1
thanks for your help and sharing your knowledge with us on this website.
Laurent
error from my 1st post
laurent, March 17, 2005 - 12:59 pm UTC
sorry, I copy-pasted the wrong create table :/
This one gives me no error, but no rows are retrieved from the file:
SQL> select * from ext_tina_jour_save_ctrlm;
no rows selected
SQL>
create table ext_tina_jour_save_ctrlm (
Client VARCHAR2(30),
Strategie VARCHAR2(10),
Repertoire_sauvegarde VARCHAR2(255),
Date_debut date,
Date_fin date
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY FIT_DIR
ACCESS PARAMETERS ( RECORDS DELIMITED BY NEWLINE
NOLOGFILE
FIELDS TERMINATED BY ' '
(client,
strategie,
Repertoire_sauvegarde,
Date_debut char date_format date mask "dd-mon-yy hh24:mi:ss",
Date_fin char date_format date mask "dd mon yy hh24:mi:ss"
)
)
LOCATION ('tina_psa089_jour_save_ctrlm')
)
reject limit unlimited;
March 17, 2005 - 2:02 pm UTC
if you say "fields terminated by ' '" and you have spaces in your fields??? not sure what I can say.
any chance?
laurent, March 17, 2005 - 6:43 pm UTC
well, I guess I will have to edit my file using perl to remove the spaces in the date fields.
Is there by any chance a way I could load that file into an external table without editing it, like using fields terminated by ' ' for the first 3 and a fixed length for the last 2 (18 characters for each date)?
thanks again
March 17, 2005 - 6:49 pm UTC
no reason you cannot map the entire line as a single varchar2(4000) and just use SQL on it:
ops$tkyte@ORA9IR2> select substr( x, 1, instr( x, ' ', 1, 1 ) ) ,
2 substr( x, instr( x, ' ', 1, 1 )+1,
instr( x, ' ', 1, 2 )-instr(x, ' ', 1, 1 ) ) ,
3 substr( x, instr( x, ' ', 1, 2 )+1,
instr( x, ' ', 1, 3 )-instr(x, ' ', 1, 2 ) ) ,
4 substr( x, instr( x, ' ', 1, 3 )+1, 18 ),
5 substr( x, instr( x, ' ', 1, 3 )+1+19 )
6 from t
7 /
SUBSTR(X,1,INSTR(X,'',1,1))
-------------------------------------------------------------------------------
SUBSTR(X,INSTR(X,'',1,1)+1,INSTR(X,'',1,2)-INSTR(X,'',1,1))
-------------------------------------------------------------------------------
SUBSTR(X,INSTR(X,'',1,2)+1,INSTR(X,'',1,3)-INSTR(X,'',1,2))
-------------------------------------------------------------------------------
SUBSTR(X,INSTR(X,'
------------------
SUBSTR(X,INSTR(X,'',1,3)+1+19)
-------------------------------------------------------------------------------
assurance3-x_XGDR00
C
/base/oracle/XGDR00
06 Mar 05 08:05:08
06 Mar 05 11:28:00
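On 10g and later, the same split could also be written with REGEXP_SUBSTR instead of the nested INSTR/SUBSTR calls. A sketch only, assuming the first three fields never contain spaces, the fields are separated by single spaces, and each date is exactly 18 characters:

```sql
-- pick the first three space-free tokens, then take the
-- two fixed-width dates from known offsets
select regexp_substr(x, '[^ ]+', 1, 1) client,
       regexp_substr(x, '[^ ]+', 1, 2) strategie,
       regexp_substr(x, '[^ ]+', 1, 3) repertoire_sauvegarde,
       to_date(substr(x, instr(x, ' ', 1, 3) + 1, 18),
               'dd Mon rr hh24:mi:ss') date_debut,
       to_date(substr(x, -18),
               'dd Mon rr hh24:mi:ss') date_fin
  from t;
```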
External Table Problem
Faisal, June 16, 2005 - 10:41 am UTC
Hi Tom,
I have a problem creating an external table dynamically using a stored procedure. Here is the info.
I have one external table template.
CREATE TABLE CLM_HWINGCLMAC_EXT
(
TEST_COL1 VARCHAR2(5 BYTE),
TEST_COL2 VARCHAR2(5 BYTE)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY EISHCLM_DIR
ACCESS PARAMETERS
( RECORDS DELIMITED BY NEWLINE
BADFILE eishclm_dir:'hwingclmac.bad'
FIELDS RTRIM
( test_col1 (1:5),
test_col2 (6:10))
)
LOCATION (EISHCLM_DIR:'hwingclmac.ctl')
)
REJECT LIMIT UNLIMITED;
Following block of code works fine without any problem.
DECLARE
string1 VARCHAR2(2000);
string2 VARCHAR2(2000);
tname VARCHAR2(30) := 'CLM_HWINGCLMAC_EXT';
w_table_owner VARCHAR2(30) := 'CLAIM';
ttype VARCHAR2(30) := 'TABLE';
tnewname VARCHAR2(30)
:= 'CLM_HWINGCLMAC_EXT_NEW';
w_file_name VARCHAR2(30) := 'hwingclmac.ctl';
filedir VARCHAR2(30) := 'EISHCLM_DIR';
BEGIN
SELECT REPLACE(DBMS_METADATA.get_ddl(ttype
,tname
,w_table_owner
)
,tname
,tnewname
)
INTO string1
FROM DUAL;
string2 :=
'alter table '
|| tnewname
|| ' LOCATION ('
|| filedir
|| ':'
|| ''''
|| w_file_name
|| ''''
|| ')';
EXECUTE IMMEDIATE string1;
EXECUTE IMMEDIATE string2;
END;
/
But it did not work from a stored procedure:
CREATE OR REPLACE PROCEDURE create_ext_tables IS
string1 VARCHAR2(2000);
string2 VARCHAR2(2000);
tname VARCHAR2(30) := 'CLM_HWINGCLMAC_EXT';
w_table_owner VARCHAR2(30) := 'CLAIM';
ttype VARCHAR2(30) := 'TABLE';
tnewname VARCHAR2(30)
:= 'CLM_HWINGCLMAC_EXT_NEW';
w_file_name VARCHAR2(30) := 'hwingclmac.ctl';
filedir VARCHAR2(30) := 'EISHCLM_DIR';
BEGIN
SELECT REPLACE(DBMS_METADATA.get_ddl(ttype
,tname
,w_table_owner
)
,tname
,tnewname
)
INTO string1
FROM DUAL;
string2 :=
'alter table '
|| tnewname
|| ' LOCATION ('
|| filedir
|| ':'
|| ''''
|| w_file_name
|| ''''
|| ')';
EXECUTE IMMEDIATE string1;
EXECUTE IMMEDIATE string2;
END;
/
I got the following error
ORA-01031: insufficient privileges
ORA-06512: at "CLAIM.CREATE_EXT_TABLES", line 33
ORA-06512: at line 2
Please help me!!!
June 16, 2005 - 1:10 pm UTC
More Info
A reader, June 16, 2005 - 2:27 pm UTC
Hi Tom,
I am sorry, I missed one point: I can create the stored procedure without any error; the error appears when the procedure tries to create the external table using execute immediate.
Regards,
June 16, 2005 - 2:46 pm UTC
see same link. It is exactly that reason.
Thanks
Faisal, June 16, 2005 - 4:12 pm UTC
Hi Tom,
Problem resolved, thanks a million!!!
Your suggestion Required
Faisal, June 17, 2005 - 6:07 pm UTC
Hi Tom,
Is the way I am creating dynamic external tables on a daily basis the right approach, or is there a better way to do it? I need your opinion.
CREATE OR REPLACE PROCEDURE create_ext_tables IS
string1 VARCHAR2(2000);
string2 VARCHAR2(2000);
tname VARCHAR2(30) :=
'CLM_HWINGCLMAC_EXT';
w_table_owner VARCHAR2(30) :=
'CLAIM';
ttype VARCHAR2(30) :=
'TABLE';
tnewname VARCHAR2(30)
:=
'CLM_HWINGCLMAC_EXT_NEW';
w_file_name VARCHAR2(30) :=
'hwingclmac.ctl';
filedir VARCHAR2(30) :=
'EISHCLM_DIR';
BEGIN
SELECT REPLACE(DBMS_METADATA.get_ddl(ttype
,tname
,w_table_owner
)
,tname
,tnewname
)
INTO string1
FROM DUAL;
string2 :=
'alter table '
|| tnewname
|| ' LOCATION ('
|| filedir
|| ':'
|| ''''
|| w_file_name
|| ''''
|| ')';
EXECUTE IMMEDIATE string1;
EXECUTE IMMEDIATE string2;
END;
Thanks,
June 17, 2005 - 6:59 pm UTC
I don't know - what is your "goal"
More Info.
Faisal, June 17, 2005 - 9:39 pm UTC
Hi Tom,
For my DWH project, we have two source systems; each source system has 4 companies, and each company will transfer 5 flat files (of different structures) to the DWH. On a daily basis, we will receive (2 source systems X 8 companies X 5 types of files = 80 files). The structure of the flat files of source A is different from source B, but within a source, the structure of the files is identical across the companies. So basically I have two different sets of files for multiple companies. Now, in order to load the data into the DWH using external tables, I am planning to create two sets of external tables (templates), i.e. I will have 10 external tables permanently. Once I receive the data of source A and company C1, I will create the external tables using the predefined metadata table as I am doing in the above procedure, load the data, and then drop the external tables. To achieve this functionality, do you think the procedure I wrote is the right approach?
June 18, 2005 - 7:11 pm UTC
why do you need to copy the external table definition? why not just alter the file name and load from it? (from the one)
A reader, June 19, 2005 - 10:09 am UTC
Because we have to load multiple companies in parallel, I cannot create one definition; I need separate tables for each company and source system.
Regards,
Faisal
June 19, 2005 - 11:47 am UTC
ok, so create them all at once. why do you need to create them dynamically?
Faisal, June 19, 2005 - 1:42 pm UTC
Because if there is a structure change (a new column added) then I have to update 8 tables (identical tables for 8 companies), but if I have one template (one for each source) then the change is in one place, i.e. the template. This is the only reason I am going with the dynamic approach.
June 19, 2005 - 2:06 pm UTC
"The way I am creating dynmaic external on daily basis "
why on a daily basis.
No problem with scripting the creation, but why create them every single day???? Why not just create them once and when they change, not every time?
Faisal, June 19, 2005 - 9:45 pm UTC
Hi Tom,
I understand your point. My only objective was to change the definition of the template external table once and have the copies created on the fly, but I think you are right: there is no need to drop and create them on a daily basis. I can create them once (which is 80 tables), and while changes will be quite frequent at the start of the project, after some time it will be stable and it is really not worth doing it dynamically. Thanks for the guidance....
Faisal,
Implicit Ordering for External Tables
John Gilmore, June 28, 2005 - 7:35 am UTC
Hi Tom,
I know that there is no explicit ordering of data in a normal table, but can I assume that data read by the ORACLE_LOADER driver from an external flat file will be returned in the same order as the physical file?
What I'm trying to do is read from an external table into a physical table. In the process, I want to include a pseudo-column based on a sequence so that selecting from the physical table (and ordering by the sequence column) will give the same order that is present in the external file. Will this work, or is it not supported? Any suggestions for a workaround would be gratefully accepted.
June 28, 2005 - 8:06 am UTC
as long as there is no parallel/set operations the file is a sequential file. if you do parallel or anything "set" oriented that implies a group or a sort of any kind, it'll be whatever order it feels like.
(rownum is useful here -- select rownum R, et.* from et)
Interesting
John Gilmore, June 29, 2005 - 2:44 am UTC
So an explicit order in external tables is supported. Do you know if this is documented anywhere? I couldn't find anything in the documentation to suggest that this was a supported feature.
June 29, 2005 - 8:55 am UTC
as long as you don't order it, group it, "set it", parallel it, it has to read the sequential file from top to bottom. I do not believe it to be documented, no.
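The rownum approach above can be sketched as a direct insert-select; the physical table and column names here are made up for illustration:

```sql
-- preserve file order: no ORDER BY, GROUP BY or PARALLEL on the select,
-- so rownum is assigned in the order the sequential file is read
insert into emp_physical (seq_col, empno, ename)
select rownum, et.empno, et.ename
  from external_table et;
```

Ordering emp_physical by seq_col then reproduces the order of the flat file.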
External Table with POSITIONal fields and DECODE function
Ramasamy, July 18, 2005 - 11:09 am UTC
Control File Structure
LOAD DATA
TRUNCATE
INTO TABLE table1
(column1 POSITION(1:8) "LTRIM(:column1,'0')",
column2 POSITION (54:54) "DECODE(:column1,'O',1,'S',3,2)" ,
...
How can I create an external table that calls functions for positional fields, as I do above in control files?
Mainly functions like DECODE, LTRIM, etc.
Thanks a lot for your help.
July 18, 2005 - 11:19 am UTC
$ sqlldr u/p controlfile.name EXTERNAL_TABLE=GENERATE_ONLY
let sqlldr do the heavy lifting, the log file from that (it won't load ANYTHING) will have the external table definition.
What you'll discover is the SQL functions are used in .... SQL.
The ET doesn't have ltrim, decode, the SELECT against the ET does.
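So the generated external table maps only the raw positional fields, and the functions move into the query against it. A sketch, assuming the generated table is named table1_ext with the same column names as table1 and mirroring the control-file expressions above:

```sql
-- SQL functions are applied at query time, not in the external table DDL
insert into table1 (column1, column2)
select ltrim(column1, '0'),
       decode(column1, 'O', 1, 'S', 3, 2)
  from table1_ext;
```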
bit late, but ...
Gabe, July 18, 2005 - 11:43 am UTC
John Gilmore said ...
<quote>What I'm trying to do is read from an external table into a physical table. In the process, I want to include a pseudo-column based on a sequence so that selecting from the physical table (and ordering by the sequence column) will give the same order that is present in the external file. Will this work or is it not supported? Any suggestions for a workaround would be gratefully
accepted.
</quote>
RECNUM in the external table definition?
July 18, 2005 - 12:17 pm UTC
RECNUM is ok as long as you don't need it to be sequential like rownum would be, yes.
Using External Table
Yoav, September 13, 2005 - 3:22 am UTC
Hi Mr. Kyte
I would like to know if it is possible to do the following task using external tables:
I have a directory on a Windows mapped network drive.
I need to read the contents of all the files in that directory (the names of the files are their creation dates).
Each file contains one header line that should be inserted into the DEPT table, and the rest of the lines should be inserted into the EMP table.
q1. How can I direct the first line to the DEPT table and the rest of the lines to the EMP table using an external table?
q2. As I said, the names of the files change. Is it possible to change the value of the location clause dynamically?
Thank you.
September 13, 2005 - 12:05 pm UTC
q1)
insert
when r = 1 then into t1 (username, user_id, created) values (username,user_id,created)
when r <>1 then into t2 (username, user_id, created) values (username,user_id,created)
select rownum r, username, user_id, created
from external_table;
q2) you can use alter table to change the file location referenced by the external table.
However, your two main hurdles will be:
a) getting the list of files (java stored procedure can do that)
b) giving Oracle the ability to see that directory - it is not as easy as it would be on Unix.
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:241814624807
External Table
Yoav, September 14, 2005 - 4:25 am UTC
Hi Mr.Kyte
Your answer was very helpful.
I would like to ask one more question:
What if the structure of the first line in the file is different from the rest of the lines in the same file?
Is it still possible to read the contents of the file using an external table?
In the following example, the first line represents the department the users belong to and is separated
by "@", while the rest of the lines include details about the employees and are separated by commas.
10@FINANCE@NEY YORK
RAN,132,01-JUL-2005
MICK,201,01-APR-2005
JOHN,228,01-MAR-2005
Thanks You!!
September 14, 2005 - 8:31 am UTC
You would probably want to use two different external tables.
One, you just query the first row.
The other, you have it SKIP the first row (you can do that in the syntax of the external table create: "SKIP 1").
else, you would have to generically map all of the fields of the widest table (the one with the most columns) to varchar2's and use to_date/to_number to convert them all.
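A sketch of the two-table approach, with both tables reading the same file; all table, field, and file names here are illustrative:

```sql
-- header: '@'-separated; query just the first row
-- (the comma-separated detail lines overflow deptno and are rejected)
create table dept_ext ( deptno varchar2(10), dname varchar2(30), loc varchar2(30) )
organization external
( type oracle_loader default directory my_dir
  access parameters ( records delimited by newline
                      fields terminated by '@' )
  location ('20050914.txt') )
reject limit unlimited;

-- detail: comma-separated, skipping the header line
create table emp_ext ( ename varchar2(30), empno varchar2(10), hired varchar2(12) )
organization external
( type oracle_loader default directory my_dir
  access parameters ( records delimited by newline
                      skip 1
                      fields terminated by ',' )
  location ('20050914.txt') )
reject limit unlimited;
```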
whitespace before and after in the column data
Vin, November 08, 2005 - 9:08 pm UTC
Oracle 10.2.0.1.0
With the external table defined below and the seed data as given.
Seed data example: whitespace exists in the second column's data, whose length is 11:
'whitespace"C"whitespacewhitespace"P64"whitespacewhitespace"2"whitespace'
2002, C P64 2 ,180390130
2002, EW65 2 ,180445120
create table ET_VIN
(
MYEAR varchar2(4),
MVIN varchar2(11),
MCODE varchar2(9)
)
organization external
( type oracle_loader default directory ff_mvv_dir
access parameters ( records delimited by ','
fields notrim
( MYEAR, MVIN, MCODE )
)
location ('mvv.csv')
)
reject limit unlimited;
also tried the below format without success
fields notrim
( MYEAR char(4),
MVIN char(11),
MCODE char(9)
)
Also tried using SQLLDR with the control file defined below
load data
infile 'mvv.csv'
append into table MVV
fields terminated by ","
optionally enclosed by '"'
trailing nullcols
(
MYEAR ,
MVIN char "substr(:VIN,1,11)",
MCODE
)
Is there any way to load that whitespace along with the other characters? The requirement is that after loading this col(11), I need to split this one column of size 11 into 11 columns of size 1, so loading the whitespace is important to get the correct position of each character within the column length of 11.
As always, I appreciate the valuable time you give in educating and helping the Oracle user community.
whitespace before and after in the column data
Vin, November 11, 2005 - 9:36 am UTC
Just so that you know, this worked for me using sqlldr.
(
MYEAR position(1:4),
MVIN position(6:16),
MCODE position(18:26)
)
I think I could use the same positioning of the data with external tables. Please correct me if I am wrong.
As always, your sharing of these tricks and knowledge, and the valuable time you spare, is immensely appreciated.
November 12, 2005 - 8:35 am UTC
take your control file and run
sqlldr u/p controlfilename external_table=generate_only
and you'll have your external table in the log file.
whitespace before and after in the column data
Vin, November 11, 2005 - 7:54 pm UTC
This worked for me while using a External Table.
create table ET_VIN
(
MYEAR varchar2(4),
MVIN varchar2(11),
MCODE varchar2(9)
)
organization external
( type oracle_loader default directory ff_mvv_dir
access parameters ( records delimited by newline
nologfile
fields terminated by ','
optionally enclosed by '"' and '"' notrim
missing field values are null
reject rows with all null fields
(MYEAR position(1:4),
MVIN position(6:16),
MCODE position(18:26)
)
)
location ('mvv.csv')
)
reject limit unlimited;
Regards
Very Simple SQLPlus Script Calls - @@ - C:\Documents and Settings\
Prince Faran, November 18, 2005 - 10:27 am UTC
Hello Tom
I have two very simple SQL*Plus script files, script1 and script2, placed in the same folder. One script calls the other using @@. The fully qualified names of these scripts and their code are as follows:
-- C:\Documents and Settings\Administrator\My Documents\My Scripts\script1.sql
======================start of script=====================
connect scott/tiger;
@@script2.sql
=======================end of script======================
-- C:\Documents and Settings\Administrator\My Documents\My Scripts\script2.sql
======================start of script=====================
SELECT user FROM dual;
=======================end of script======================
Here you can see what happened when I called script1 from the SQL prompt:
SQL> @C:\Documents and Settings\Administrator\My Documents\My Scripts\script1.sql
SP2-0310: unable to open file "C:\Documents.sql"
SQL> -- Sorry, I forgot to enclose the filename in double quotes.
SQL> @"C:\Documents and Settings\Administrator\My Documents\My Scripts\script1.sql"
Connected.
SP2-0310: unable to open file "C:\Documents.sql"
SQL> -- connect command in script1 worked but script1 could not call script2 within the same folder
I guess that with Oracle9i on Windows 2003, @@ cannot handle a current path containing white space (in the ordinary way). Can you guide me to a solution for the above scenario?
Thanks
Faran
November 18, 2005 - 3:38 pm UTC
I suppose the answer is to not use spaces in path names - you've diagnosed the issue all of the way.
Spaces in path names kill so many things (not just Oracle things).
RE: Problem with External table and Timestamps
Kevin, November 28, 2005 - 2:06 pm UTC
Tom:
I have a table which has some date fields in it:
Date_Table
============
FieldA DATE,
FieldB DATE
The data is coming from a MSSQL server in the format of a timestamp:
outfile.dat
------------------
2005-11-11 00:00:00.000||2003-12-10 00:00:00.000||
I setup the externally organized table as:
CREATE TABLE test_file_extern(SYS_UPDATE_DATE DATE,
DW_LOAD_DT DATE)
ORGANIZATION EXTERNAL (
TYPE ORACLE_LOADER
DEFAULT DIRECTORY export_dir
ACCESS PARAMETERS
( RECORDS DELIMITED BY NEWLINE
FIELDS TERMINATED BY '||'
(SYS_UPDATE_DATE TIMESTAMP 'YYYY-MM-DD HH:MI:SSxFF3', DW_LOAD_DT TIMESTAMP 'YYYY-MM-DD HH:MI:SSxFF3')
)
LOCATION ('somefile.dat')
)
REJECT LIMIT 0;
The documentation for 9i CLEARLY states that TIMESTAMP is a valid input format:
"The opaque_format_spec lets you list the parameters and their values. Please refer to Oracle9i Database Utilities for information on how to specify values for the opaque_format_spec"
and
Table 6-2 Datatype Conversions for Datetime and Interval Datatypes
SQL*Loader Datatype Oracle Database Datatype (Conversion Support)
N = N (Yes), C (Yes), D (No), T (No), TS (No), YM (No), DS (No)
C = N (Yes), C (Yes), D (Yes), T (Yes), TS (Yes), YM (Yes), DS (Yes)
D = N (No), C (Yes), D (Yes), T (No), TS (Yes), YM (No), DS (No)
T = N (No), C (Yes), D (No), T (Yes), TS (Yes), YM (No), DS (No)
TS = N (No), C (Yes), D (Yes), T (Yes), TS (Yes), YM (No), DS (No)
YM = N (No), C (Yes), D (No), T (No), TS (No), YM (Yes), DS (No)
DS = N (No), C (Yes), D (No), T (No), TS (No), YM (No), DS (Yes)
and
date_format_spec ::=
DATE_FORMAT
{
{ DATE | { TIME | TIMESTAMP } [WITH TIMEZONE] } MASK \"date/time mask\"
| INTERVAL { YEAR_TO_MONTH | DAY_TO_SECOND }
}
For information on how to read this syntax, see
However, when I try to select data from the table I get the following error:
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-00554: error encountered while parsing access parameters
KUP-01005: syntax error: found "timestamp": expecting one of: "comma, char,
date, defaultif, decimal, double, float, integer, (, nullif, oracle_date,
oracle_number, position, raw, recnum, ), unsigned, varrawc, varchar, varraw,
varcharc, zoned"
KUP-01007: at line 7 column 38
ORA-06512: at "SYS.ORACLE_LOADER", line 14
ORA-06512: at line 1
For the life of me I can't figure out why... The documentation says I can, and so do tons of other examples.
I'm using the following:
Oracle Version: 9iR2 v9.2.0.6
OS Version: Solaris v5.9
Any ideas?
November 28, 2005 - 3:29 pm UTC
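The KUP-01005 above is the same error laurent hit earlier in this thread: in the ORACLE_LOADER field list, a datatype (CHAR) must precede date_format. A sketch of a corrected field clause - the change to an HH24 mask is an assumption here, since the sample data carries midnight as "00", which the 12-hour HH mask would reject:

```sql
( RECORDS DELIMITED BY NEWLINE
  FIELDS TERMINATED BY '||'
  ( SYS_UPDATE_DATE CHAR date_format TIMESTAMP mask "YYYY-MM-DD HH24:MI:SSxFF3",
    DW_LOAD_DT      CHAR date_format TIMESTAMP mask "YYYY-MM-DD HH24:MI:SSxFF3" )
)
```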
Nullable FKs and external tables
Sam Chowdary, November 28, 2005 - 3:11 pm UTC
We depend on external tables a lot to import data from CSV-format files. Though we have overcome many other issues, we could not overcome this issue with nullable FKs.
Suppose there are master and detail tables, and the detail table has the master table's FK as nullable. When we try to load data from external tables, we need to have some data in all FK columns - otherwise we cannot import! Until now we overcame this by having dummy records in all master tables and carrying the dummy value in the FK columns of the child tables (wherever data is not required). This approach does not look good.
Do you have any suggestion to over come this issue?
Thanks in advance.
November 28, 2005 - 3:34 pm UTC
why can you not import?
If the fkey columns are allowed to be NULL, set them null and away you go?
problem is not clear, do you have an example?
Nullable FKs and external tables
A reader, November 28, 2005 - 3:58 pm UTC
Problem here is with state_id. Unless I provide some dummy data in parent (state) and child (supplier -> SPLR_TAG) I am not able to load external data.
1) SUPPLIER table
SPLR_ID NOT NULL NUMBER(10)
SPLR_TAG NOT NULL VARCHAR2(50)
SPLR_NAME VARCHAR2(50)
SPLR_ADDRS VARCHAR2(100)
SPLR_CITY VARCHAR2(50)
STATE_ID NUMBER(10) -> Nullable FK to STATE parent table
SPLR_ZIP VARCHAR2(20)
CNTRY_ID NOT NULL NUMBER(10) -> FK to COUNTRY parent table
TZONE_ID NOT NULL NUMBER(10) -> FK to TIMEZONE parent table
CALN_ID NOT NULL NUMBER(10) -> FK to CALENDAR parent table
2) Script to load external data.
create table SUPPLIER_EXT
(
SPLR_TAG VARCHAR2(50),
SPLR_NAME VARCHAR2(50),
SPLR_ADDRS VARCHAR2(100),
SPLR_CITY VARCHAR2(50),
STATE_TAG VARCHAR2(50),
SPLR_ZIP VARCHAR2(20),
CNTRY_TAG VARCHAR2(50),
TZONE_TAG VARCHAR2(50),
CALN_TAG VARCHAR2(50)
)
ORGANIZATION EXTERNAL
( type oracle_loader default directory data_dir
access parameters
(
RECORDS DELIMITED BY NEWLINE
BADFILE bad_dir:'SUPPLIER.bad'
SKIP 1
LOGFILE log_dir:'SUPPLIER.log'
fields terminated by ','
MISSING FIELD VALUES ARE NULL
)
location('SUPPLIER.csv'))
REJECT LIMIT UNLIMITED;
MERGE INTO SUPPLIER
USING (
SELECT SUPPLIER_EXT.*, t271.STATE_ID "STATE_ID", t272.CNTRY_ID "CNTRY_ID", t273.TZONE_ID "TZONE_ID", t274.CALN_ID "CALN_ID"
FROM SUPPLIER_EXT, STATE t271, COUNTRY t272, TIMEZONE t273, CALENDAR t274
WHERE SUPPLIER_EXT.STATE_TAG = t271.STATE_TAG AND SUPPLIER_EXT.CNTRY_TAG = t272.CNTRY_TAG AND SUPPLIER_EXT.TZONE_TAG = t273.TZONE_TAG AND SUPPLIER_EXT.CALN_TAG = t274.CALN_TAG
) ABC ON ( SUPPLIER.SPLR_TAG = ABC.SPLR_TAG)
WHEN MATCHED THEN
UPDATE SET
SUPPLIER.SPLR_NAME = ABC.SPLR_NAME, SUPPLIER.SPLR_ADDRS = ABC.SPLR_ADDRS, SUPPLIER.SPLR_CITY = ABC.SPLR_CITY, SUPPLIER.STATE_ID = ABC.STATE_ID, SUPPLIER.SPLR_ZIP = ABC.SPLR_ZIP, SUPPLIER.CNTRY_ID = ABC.CNTRY_ID, SUPPLIER.TZONE_ID = ABC.TZONE_ID, SUPPLIER.CALN_ID = ABC.CALN_ID
WHEN NOT MATCHED THEN
INSERT (
SUPPLIER.SPLR_ID, SUPPLIER.SPLR_TAG, SUPPLIER.SPLR_NAME, SUPPLIER.SPLR_ADDRS, SUPPLIER.SPLR_CITY, SUPPLIER.STATE_ID, SUPPLIER.SPLR_ZIP, SUPPLIER.CNTRY_ID, SUPPLIER.TZONE_ID, SUPPLIER.CALN_ID
) VALUES (
SEQ_SUPPLIER.nextval, ABC.SPLR_TAG, ABC.SPLR_NAME, ABC.SPLR_ADDRS, ABC.SPLR_CITY, ABC.STATE_ID, ABC.SPLR_ZIP, ABC.CNTRY_ID, ABC.TZONE_ID, ABC.CALN_ID
);
drop table SUPPLIER_EXT;
November 28, 2005 - 11:44 pm UTC
still don't get it. why is there a problem?
can you take the example down to:
a) simple table - just one column and one FK column should be sufficient
b) supply everything needed for simple example.
If the incoming fk is null - what is the issue?
correction
A reader, November 28, 2005 - 4:01 pm UTC
PS: read SPLR_TAG as STATE_TAG
Problem here is with state_id. Unless I provide some dummy data in parent
(state) and child (supplier -> STATE_TAG) I am not able to load external data.
Issue with import
A reader, November 30, 2005 - 7:33 pm UTC
When I try to import SUPPLIER rows with an empty state_tag, it does not import any records. Maybe because the SQL is trying to match a parent record and is not able to find one?
1) SUPPLIER table
SPLR_ID NOT NULL NUMBER(10)
SPLR_TAG NOT NULL VARCHAR2(50)
STATE_ID NUMBER(10) -> Nullable FK to STATE parent table
2) Script to load external data.
create table SUPPLIER_EXT
(
SPLR_TAG VARCHAR2(50),
STATE_TAG VARCHAR2(50)
)
ORGANIZATION EXTERNAL
( type oracle_loader default directory data_dir
access parameters
(
RECORDS DELIMITED BY NEWLINE
BADFILE bad_dir:'SUPPLIER.bad'
SKIP 1
LOGFILE log_dir:'SUPPLIER.log'
fields terminated by ','
MISSING FIELD VALUES ARE NULL
)
location('SUPPLIER.csv'))
REJECT LIMIT UNLIMITED;
MERGE INTO SUPPLIER
USING (
SELECT SUPPLIER_EXT.*, t271.STATE_ID "STATE_ID"
FROM SUPPLIER_EXT, STATE t271
WHERE SUPPLIER_EXT.STATE_TAG = t271.STATE_TAG ) ABC ON ( SUPPLIER.SPLR_TAG = ABC.SPLR_TAG)
WHEN MATCHED THEN
UPDATE SET SUPPLIER.STATE_ID = ABC.STATE_ID
WHEN NOT MATCHED THEN
INSERT (
SUPPLIER.SPLR_ID, SUPPLIER.SPLR_TAG, SUPPLIER.STATE_ID
) VALUES (
SEQ_SUPPLIER.nextval, ABC.SPLR_TAG, ABC.STATE_ID
);
drop table SUPPLIER_EXT;
November 30, 2005 - 9:14 pm UTC
I need a full example, in the fashion I provide to you on this site - something anyone could run.
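The reader's hypothesis fits the script above: the inner join in the USING clause drops every row whose STATE_TAG is null. A sketch of one possible fix, assuming a missing state should simply load as a NULL state_id (the outer join keeps the unmatched rows):

```sql
MERGE INTO supplier s
USING (
  SELECT e.splr_tag, t.state_id
    FROM supplier_ext e, state t
   WHERE e.state_tag = t.state_tag (+)   -- outer join: rows with null state_tag survive
) abc ON (s.splr_tag = abc.splr_tag)
WHEN MATCHED THEN
  UPDATE SET s.state_id = abc.state_id
WHEN NOT MATCHED THEN
  INSERT (s.splr_id, s.splr_tag, s.state_id)
  VALUES (seq_supplier.nextval, abc.splr_tag, abc.state_id);
```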
Problem loading data from a file with variable length
Knut Billerbeck, January 12, 2006 - 9:13 am UTC
Hi Tom,
I have a file with the following contents:
0028This is the first record0029This is the second record
I would like to get the contents of such a file into an external table. The problem is that when I declare the record with a variable length, say 'RECORDS VARIABLE 4', the loader expects the 4 length bytes to be excluded from the number of bytes to be read (meaning the loader would read 28 bytes after the 0028), but I have them included (meaning the loader should read only 28-4=24 bytes after the 0028). How can I do this?
Thanks in advance
Knut
January 12, 2006 - 11:04 am UTC
not sure that you can
Alex, March 22, 2006 - 2:44 pm UTC
Hi Tom,
Sorry to bother you with such an easy question, but I cannot find the answer anywhere. All I want to do is specify a default value for a column in my external table.
So say I'm loading 3 columns, but only two have values in my input file, I want to use the same value for all the rows for this third column.
I received an error when trying to use the default clause on the column in my external table, as if it were a regular table. I hope that makes sense.
March 22, 2006 - 5:14 pm UTC
define "default value" in the context of an external table?
do you mean "if this field is null return this"
do you mean "add this field with this constant value"
Either of which is easily done via a view:
create view et_view
as
select et.*, nvl(that_column,'x') case_1,
'y' case_2
from et;
and use the view, not the external table.
Alex, March 22, 2006 - 5:43 pm UTC
Ok what I mean is, let's say the data in my file has the value for two columns, delimited by '|', e.g.
Joe Stevens | 603-321-1234
But the Oracle table I will be copying the external table data into has three columns: name, number, and data_tag. The data_tag column is something I will be using to identify the source of the data. So, other than copying and pasting "| data_load" in textpad over and over for every row, can I default that value to be loaded for the data_tag column?
Hope that is more clear, thank you.
March 22, 2006 - 7:13 pm UTC
did you see my answer above? "create view" and done.
or just select the constant in your query.
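A sketch of the second suggestion - selecting the constant directly - with made-up table and column names:

```sql
-- tag every loaded row with a constant source identifier
insert into contacts (name, phone, data_tag)
select name, phone, 'data_load'
  from contacts_ext;
```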
EXTERNAL TABLE - Selects.
VIKAS, April 03, 2006 - 3:31 am UTC
Dear Mr. Kyte
Our application requires selecting some hardcoded values from a flat Notepad text file. For this I thought of using external tables. I've tried creating the tables as follows:
create or replace directory EXTTAB '\\vikas\E:\Shared\EXTTAB';
Directory created
Subsequently, In the above created Direcory, I've created a Text file "XtrLeads.txt"
(Having values
32, 001,
32, 002,
32, 003, ... and so on)
This file would be used to hold the above values to be picked/selected by the external table below.
Then,
SQL> create table EXTLEADS
2 (
3 OID number(5),
4 LID NUMBER(8)
5 )
6 organization external
7 (
8 type oracle_loader
9 default directory EXTTAB
10 access parameters
11 (
12 fields terminated by ','
13 records delimited by newline
14 nobadfile
15 nodiscardfile
16 nologfile
17 )
18 location('XtrLeads.txt')
19 )
20 reject limit unlimited
21 /
Table created
But now, when I
Select * from EXTLEADS;
I get follwing messages...
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-00554: error encountered while parsing access parameters
KUP-01005: syntax error: found "records": expecting one of: "enclosed, exit, (, ltrim, lrtrim, ldrtrim, missing, notrim, optionally, rtrim, reject"
KUP-01007: at line 2 column 5
ORA-06512: at "SYS.ORACLE_LOADER", line 14
ORA-06512: at line 1
Here, my Oracle Database Server is on Fedora Core 4 Linux, my client machine is Windows XP (where I am also creating the above directory), and lastly, I have not configured the UTL_FILE_DIR parameter (it is at its default, empty).
Please tell me where I am going wrong, and how I can do it right and achieve the desired results.
Take care, regards...
April 03, 2006 - 8:12 am UTC
you do know that windows is not like Unix - it is not really a multi-user OS. That share - not like a mount point in Unix - likely will NOT be visible to the server at all; the file system you are trying to read just won't be there. Getting a windows service able to see a remote filesystem is "not trivial". Also, the access parameters are in the wrong order - the records clause must come before the fields clause:
access parameters
(
records delimited by newline
nobadfile
nodiscardfile
nologfile
fields terminated by ','
)
location('XtrLeads.txt')
)
EXTERNAL TABLE - Selects
VIKAS, April 04, 2006 - 2:12 am UTC
Dear Mr. Kyte,
Thanks for the reply to the above post.
Based on it, I would like to know: how can we achieve this in a Windows environment?
I mean, how can one create and use external tables in a client-server network where the database server is Linux-based and the client machine is Windows XP, and a user wants to use his Windows client machine to create external tables and make use of them?
Also, whenever I try to create the directory on my client machine (Windows XP), the creation is successful, but when I physically check for the existence of the directory, it is not there. Where is the directory actually being created? Is it created on the database server? If not, then where?
Why is this command successful?
SQL > create directory dir1 as 'C:\dir1'
What workarounds should one try in such conditions?
Take care, Regards...
April 04, 2006 - 9:54 am UTC
if the database server is Linux, you can MOUNT file systems on linux and linux will be able to see it.
it would work if the server were WINDOWS - and since you used a UNC name in your directory.... it would appear that you assumed it was.
SELECT COUNT(*) VS SELECT * ON EXTERNAL TABLES ...
VKOUL, April 04, 2006 - 1:59 pm UTC
SQL>
SQL> select * from iislog_web1;
no rows selected
SQL> select count(*) from iislog_web1;
COUNT(*)
----------
44
SQL>
SQL>
SQL> !file iislog_web1.dat
iislog_web1.dat: ASCII text, with CRLF line terminators
SQL>
Q1 : This file contains some unwanted characters at the end of each line, which makes the external table reject all records. Why does COUNT(*) show 44 records vs. SELECT * showing none?
Q2 : Is there a way for an external table to clean and load the file without running a UNIX command like "tr -cd '\11\12\40-\176' < input_file > output_file"?
Thanks
April 04, 2006 - 7:32 pm UTC
and how did you create the external table.
SELECT COUNT(*) VS SELECT * ON EXTERNAL TABLES ...
VKOUL, April 05, 2006 - 3:17 pm UTC
CREATE TABLE iislog_web1
(
"date" VARCHAR2(10),
"time" VARCHAR2(8),
"s-ip" VARCHAR2(15),
"cs-method" VARCHAR2(4),
"cs-uri-stem" VARCHAR2(500),
"cs-uri-query" VARCHAR2(500),
"c-ip" VARCHAR2(15),
"sc-status" NUMBER,
"sc-bytes" NUMBER,
"cs-bytes" NUMBER,
"time-taken" NUMBER
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY report_dat_dir
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE
BADFILE report_bad_dir:'iislog_web1.bad'
DISCARDFILE report_bad_dir:'iislog_web1.dis'
LOGFILE report_log_dir:'iislog_web1.log'
SKIP 4
FIELDS TERMINATED BY ' '
REJECT ROWS WITH ALL NULL FIELDS
(
"date",
"time",
"s-ip",
"cs-method",
"cs-uri-stem",
"cs-uri-query",
"c-ip",
"sc-status",
"sc-bytes",
"cs-bytes",
"time-taken"
)
)
LOCATION ('iislog_web1.dat')
)
REJECT LIMIT UNLIMITED
/
April 06, 2006 - 9:37 am UTC
I assume you are in 9i - in 10g you can control this and by default, in 10g the following script:
create or replace directory report_bad_dir as '/tmp/'
/
create or replace directory report_log_dir as '/tmp/'
/
create or replace directory report_dat_dir as '/tmp/'
/
!echo a b c d e f g 1 2 3 4 > /tmp/iislog_web1.dat
!echo a b c d e f g 1 2 3 4 >> /tmp/iislog_web1.dat
!echo a b c d e f g 1 2 3 4 >> /tmp/iislog_web1.dat
!echo a b c d e f g 1 2 3 4 >> /tmp/iislog_web1.dat
!echo a b c d e f g 1 2 3 4 >> /tmp/iislog_web1.dat
!echo a b c d e f g 1 2 3 a >> /tmp/iislog_web1.dat
drop TABLE iislog_web1;
CREATE TABLE iislog_web1
(
...... your create here .....
/
select * from iislog_web1;
select count(*) from iislog_web1;
select "date", "time", "s-ip" from iislog_web1;
select "date", "time", "s-ip", "time-taken" from iislog_web1;
produces:
ops$tkyte@ORA10GR2> select * from iislog_web1;
date time s-ip cs-m
---------- -------- --------------- ----
cs-uri-stem
-------------------------------------------------------------------------------
cs-uri-query
-------------------------------------------------------------------------------
c-ip sc-status sc-bytes cs-bytes time-taken
--------------- ---------- ---------- ---------- ----------
a b c d
e
f
g 1 2 3 4
ops$tkyte@ORA10GR2> select count(*) from iislog_web1;
COUNT(*)
----------
1
ops$tkyte@ORA10GR2> select "date", "time", "s-ip" from iislog_web1;
date time s-ip
---------- -------- ---------------
a b c
ops$tkyte@ORA10GR2> select "date", "time", "s-ip", "time-taken" from iislog_web1;
date time s-ip time-taken
---------- -------- --------------- ----------
a b c 4
but in 9i, it'll:
ops$tkyte@ORA9IR2> select * from iislog_web1;
date time s-ip cs-m
---------- -------- --------------- ----
cs-uri-stem
-------------------------------------------------------------------------------
cs-uri-query
-------------------------------------------------------------------------------
c-ip sc-status sc-bytes cs-bytes time-taken
--------------- ---------- ---------- ---------- ----------
a b c d
e
f
g 1 2 3 4
ops$tkyte@ORA9IR2> select count(*) from iislog_web1;
COUNT(*)
----------
2
ops$tkyte@ORA9IR2> select "date", "time", "s-ip" from iislog_web1;
date time s-ip
---------- -------- ---------------
a b c
a b c
ops$tkyte@ORA9IR2> select "date", "time", "s-ip", "time-taken" from iislog_web1;
date time s-ip time-taken
---------- -------- --------------- ----------
a b c 4
in 9i, it only parses the fields you select - resulting in potentially different record counts if you select (or not) a field that causes a record to be rejected...
One solution would be to map each field as a varchar2(4000) instead and perform the validation/translation in the select itself.
select substr( f1, 1, 20 ) f1,
...
to_number( substr( translate(f10, ... ) , ..... ) )
from et
where <any condition to filter bad data>
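As for the tr command quoted in Q2 above, the same byte-level cleanup could also be done outside the database before loading. A rough Python equivalent (a hypothetical helper, not part of the thread):

```python
# Keep only tab (\x09), newline (\x0a) and printable ASCII \x20-\x7e,
# mirroring: tr -cd '\11\12\40-\176' < input_file > output_file
ALLOWED = {0x09, 0x0A} | set(range(0x20, 0x7F))

def clean_bytes(raw: bytes) -> bytes:
    # Drop every byte that is not in the allowed set.
    return bytes(b for b in raw if b in ALLOWED)

print(clean_bytes(b'a b c\r\n'))  # CR is stripped, LF kept: b'a b c\n'
```

Running this over the data file before pointing the external table at it would remove the trailing control characters that cause the rejects.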
SELECT COUNT(*) VS SELECT * ON EXTERNAL TABLES ...
VKOUL, April 06, 2006 - 1:49 pm UTC
You are right Tom,
I am using 9i here, but one thing I do not understand: since there are 6 records, why does it show only 2 in COUNT(*)?
$ cat iislog_web1.dat | wc -l
6
$
The records which are replicated show only once, I wonder !!
It seems to me that COUNT(*) shows the actual number of records in the file. But when you select specific column(s), it parses only those, which is actually kind of cool (I don't think we can do that in SQL*Loader that easily): if COUNT(*) and a select of a particular column produce the same number of records, then that column's data is free from errors. Which means that just by selecting a column or a group of columns you can determine which column's data is erroneous.
In 10g, as you showed it gives only the successful load of records in external table. Hmm...
April 07, 2006 - 8:18 pm UTC
in 10g, you get to "pick" the behaviour - there is an option that lets the parser accept any record as long as the SELECTED columns parse OK.
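The option being referred to here is presumably the PROJECT COLUMN clause on external tables (a sketch using the table from this thread; verify the syntax against your release's documentation):

```sql
-- 10g default: every column must parse for a record to be accepted
ALTER TABLE iislog_web1 PROJECT COLUMN ALL;

-- 9i-like behaviour: only the columns actually referenced in the
-- query must parse, so counts can vary by select list
ALTER TABLE iislog_web1 PROJECT COLUMN REFERENCED;
```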
SELECT COUNT(*) VS SELECT * ON EXTERNAL TABLES ...
VKOUL, April 06, 2006 - 1:55 pm UTC
As for it showing fewer records, I didn't realize there was a SKIP 4 in the definition.
I looked here, but it didn't answer my question
DW, April 12, 2006 - 7:28 pm UTC
I have a question about utl_file.fgetattr, but I'll take your "fair warning" to heart and I won't ask it here--it isn't directly related to the original post. I've been using this site regularly for three years and I'm a huge fan. I usually find great answers to my questions in pre-existing threads, but I'm out of luck if I can't find an answer. On at least one occasion my question was a direct follow-on to a particular thread, so I was able to ask it in the review without guilt, but not today. In three years I have never ever seen anything but "Sorry, I have a large backlog..." on the home page. You're a popular guy, Tom.
records delimited by newline option in external tables
PADMAJA, May 31, 2006 - 11:25 am UTC
Hello tom,
I have a text file on Unix box. And I created an external table as follows.
create table xt_tablenm (
-- place oracle field definitions here
--------------------------------------
field1 VARCHAR2(20),
field2 VARCHAR2(255),
field3 VARCHAR2(4),
field4 VARCHAR2(4),
field5 VARCHAR2(255),
field6 VARCHAR2(10),
field7 varchar2(8)
--------------------------------------
) organization external
( type oracle_loader default directory defdir access parameters
( records delimited by newline
badfile ext_bad:'filenm.bad'
logfile ext_log:'filenm.log'
fields terminated by '|' optionally enclosed by '"' rtrim missing
field values are null (
-- place physical field definitions here
----------------------------------------
field1,
field2,
field3,
field4,
field5,
field6,
field7
----------------------------------------
)) location ( defdir:'filenm.txt'))
reject limit 50 nologging nocache parallel ( degree 8 instances 1 );
And field5 holds comments data, which has newlines inside it. So because of
'records delimited by newline', oracle is splitting the comments data into a
second record wherever a newline occurs.
Example data:
"1111"|"field2 data"|"0 "|"1 "|"field5 data comes here.
And has few new lines in this data. "|"23122006"|"adf1"|
"1112"|"field2 data"|"2 "|"2 "|"field5 data comes here.
And has few new lines in this data. "|"23122006"|"adf2"|
in "field5 data comes here.
And has few new lines in this data. " data there is a newline so after new line rest of the comments go to socond record and i get error there that data is no correct.
Can you please sujest that there is any option i can use to no to do this
Thanks in Advance
Padmaja
May 31, 2006 - 3:21 pm UTC
You cannot do this unless you have a well defined record TERMINATOR.
Is it assured that your logical record ends with |\n ??
Every record ends with |<newline>?
And -you'll never ever have:
"1111"|"field2 data"|"0 "|"1 "|"field5 data comes here with a pipe!.|
And has few new lines in this data. "|"23122006"|"adf1"|
records delimited by newline option in external tables
PADMAJA, June 06, 2006 - 1:42 pm UTC
Hi Tom,
Thank you very much for your response.
Yes, records always end with |newline. But because I specified in the program that newline is the record delimiter, the program reads whatever follows a newline in the comments field data as a new record.
The tool we use to extract this text file is DataStage. And now I came to know that we could scrub this newline from the comments field; now it is loading data properly. But I want to know how we can fix it on the oracle side.
Thank you
Padmaja
June 06, 2006 - 2:07 pm UTC
use:
ACCESS PARAMETERS
(
RECORDS DELIMITED BY 0x'7C0A' ....
that is delimited by |\n
records delimited by newline option in external tables
PADMAJA, June 13, 2006 - 10:31 am UTC
Hi Tom,
Thank you again for your excellent answer. I didn't test it yet, but I will. How will I know that 0x'7C0A' stands for |\n? And if there is a character other than |\n, how can I convert it into the format you gave for |\n?
Thank you very much,
padmaja
June 13, 2006 - 12:27 pm UTC
ops$tkyte@ORA9IR2> select to_char(ascii('|'),'xx') from dual;
TO_
---
7c
Elapsed: 00:00:00.01
ops$tkyte@ORA9IR2> select to_char(ascii('
2 '),'xx') from dual;
TO_
---
a
Elapsed: 00:00:00.00
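The same computation can be done for an arbitrary delimiter string; a small Python sketch (a hypothetical helper, not from the thread) that builds the hex literal used in the access parameters:

```python
# Build the hex string used in RECORDS DELIMITED BY 0x'...'
# by concatenating each character's two-digit hex code.
def delimiter_hex(delim: str) -> str:
    return ''.join(format(ord(ch), '02X') for ch in delim)

print(delimiter_hex('|\n'))  # 7C0A, i.e. '|' = 7C and newline = 0A
```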
records delimited by newline option in external tables
PADMAJA, June 13, 2006 - 3:49 pm UTC
Hi Tom,
Thank you for that code. That is very helpful.
Padmaja.
Using Merge while loading from External Tables
Kumar, July 16, 2006 - 1:30 pm UTC
Tom,
Can you please look into the following merge statement. I am trying to insert into a blank table from an external table while simultaneously doing updates on the data.
The external table data is:
ab/10/0/
ab/10/0/
ab/10/0/
bc/20/0/
cd/10/0/
cd/10/0/
de/10/0/
fg/10/0/
The external table 'T_EXT' has been created successfully.
My source table is: ( g is the primary key)
g h
-------------------
ab 20
bc 20
cd 30
de 20
ef 20
The target table 't' after the merge should look like:
a b c
-------------------------
ab 10 3
ab 10 3
ab 10 3
bc 20 1
cd 10 3
cd 10 3
de 10 3
ef 20 2
fg 10 4
My merge is:
merge into T
using
(SELECT
te.A A1,
te.B,
Z.G,
Z.H,
(CASE
WHEN Z.B = Z.h
THEN 1
WHEN (Z.B > Z.h OR Z.B < Z.h)
THEN 3
ELSE 4
END ) tag
from
(
SELECT TEX.A,TEX.B,S.G,S.H FROM
(select a,c,sum(b) b from T_EXT group by a,c) TEX
FULL OUTER JOIN S
on (TEX.a=s.g )
)z,t_EXT TE WHERE Z.A= TE.A
) SR
on (t.a = sr.a1)
when matched then update set t.C = sR.tag
when not matched then
insert
(T.A,T.B,T.C)
values
(NVL(sr.a1,sr.g),NVL(sR.b,sr.h),sr.tag);
The above statement actually produces in table 't':
a b c
-------------------------
ab 10 3
ab 10 3
ab 10 3
bc 20 1
cd 10 3
cd 10 3
de 10 3
fg 10 4
a) I am not able to see the NVL values, i.e., 'ef' from table 's'.
b) Is it required for me to use T_EXT twice in the statement (the table has millions of records), once for doing the grouping and once to get the individual values?
c) I get the same result as above even using LEFT OUTER JOIN instead of FULL OUTER JOIN. Why is this?
July 16, 2006 - 4:25 pm UTC
interesting - you lost me early on with "my external table is this", "my source table is that", "my target should look like this afterwards"
All we need is
a) source table (the create table command AND the inserts to populate it)
b) target table (the create and INITIAL inserts - sort of nice to know what it started from)
c) the specification for the desired outcome
the external table seems not relevant here.
a couple of create tables and the requisite inserts into those tables - that is what's necessary here. Something we can cut and paste and run
Sorry ..here are the create table statement
Kumar, July 16, 2006 - 1:37 pm UTC
For my above request:
create table t (a varchar2(2),b number, c number);
create table s (g varchar2(2), h number);
insert into s values('ab',20);
insert into s values('bc',20);
insert into s values('cd',30);
insert into s values('de',20);
insert into s values('ef',20);
create table t_EXT (a varchar2(2),b number,c number)
ORGANIZATION EXTERNAL
( type oracle_loader
default directory TEMP
access parameters
( fields terminated by '/'
OPTIONALLY ENCLOSED BY '"' and '"'
LRTRIM(A,B,C))
location ('TESTLOAD.TXT'))
July 16, 2006 - 4:46 pm UTC
no external tables, I don't have your data, not relevant to the problem at all.
just normal tables will do - since at runtime you cannot tell the difference between a regular table and an external table!
So, table T is empty - the target is empty.
After I run your merge, I see in T the values of the SELECT in the using. That is the only thing I "should" see.
And I see them all.
Now what? I'm missing something - it did exactly what you coded and I'm not sure what you were expecting exactly.
ops$tkyte%ORA10GR2> create table t_ext( a varchar2(2), b number, c number );
Table created.
ops$tkyte%ORA10GR2> insert into t_ext values ( 'ab',10,0);
1 row created.
ops$tkyte%ORA10GR2> insert into t_ext values ( 'ab',10,0);
1 row created.
ops$tkyte%ORA10GR2> insert into t_ext values ( 'ab',10,0);
1 row created.
ops$tkyte%ORA10GR2> insert into t_ext values ( 'bc',20,0);
1 row created.
ops$tkyte%ORA10GR2> insert into t_ext values ( 'cd',10,0);
1 row created.
ops$tkyte%ORA10GR2> insert into t_ext values ( 'cd',10,0);
1 row created.
ops$tkyte%ORA10GR2> insert into t_ext values ( 'de',10,0);
1 row created.
ops$tkyte%ORA10GR2> insert into t_ext values ( 'fg',10,0);
1 row created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> SELECT
2 te.A A1,
3 te.B,
4 Z.G,
5 Z.H,
6 (CASE
7 WHEN Z.B = Z.h
8 THEN 1
9 WHEN (Z.B > Z.h OR Z.B < Z.h)
10 THEN 3
11 ELSE 4
12 END ) tag
13 from
14 (
15 SELECT TEX.A,TEX.B,S.G,S.H FROM
16 (select a,c,sum(b) b from T_EXT group by a,c) TEX
17 FULL OUTER JOIN S
18 on (TEX.a=s.g )
19 )z,t_EXT TE WHERE Z.A= TE.A
20 /
A1 B G H TAG
-- ---------- -- ---------- ----------
ab 10 ab 20 3
ab 10 ab 20 3
ab 10 ab 20 3
bc 20 bc 20 1
cd 10 cd 30 3
cd 10 cd 30 3
de 10 de 20 3
fg 10 4
8 rows selected.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> merge into T
2 using
3 (SELECT
4 te.A A1,
5 te.B,
6 Z.G,
7 Z.H,
8 (CASE
9 WHEN Z.B = Z.h
10 THEN 1
11 WHEN (Z.B > Z.h OR Z.B < Z.h)
12 THEN 3
13 ELSE 4
14 END ) tag
15 from
16 (
17 SELECT TEX.A,TEX.B,S.G,S.H FROM
18 (select a,c,sum(b) b from T_EXT group by a,c) TEX
19 FULL OUTER JOIN S
20 on (TEX.a=s.g )
21 )z,t_EXT TE WHERE Z.A= TE.A
22 ) SR
23 on (t.a = sr.a1)
24 when matched then update set t.C = sR.tag
25 when not matched then
26 insert
27 (T.A,T.B,T.C)
28 values
29 (NVL(sr.a1,sr.g),NVL(sR.b,sr.h),sr.tag);
8 rows merged.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from t;
A B C
-- ---------- ----------
ab 10 3
ab 10 3
ab 10 3
bc 20 1
cd 10 3
cd 10 3
de 10 3
fg 10 4
8 rows selected.
Kumar, July 16, 2006 - 6:00 pm UTC
Thanks for such a quick reply Tom. I was expecting to see in the target table 't' after the merge:
a b c
-------------------------
ab 10 3
ab 10 3
ab 10 3
bc 20 1
cd 10 3
cd 10 3
de 10 3
ef 20 2 -- This is not there in T_EXT but is there in table 'S'.
fg 10 4
And I see you have run the merge statement I used. It is not producing the row with the value 'ef' from table S.
You said the external table is not relevant here. I am hoping to use an external table for the data in the flat file and then merge into 'T' using source 'S'.
Also, as you observed, I have used T_EXT twice in the merge. Is that required for the result I need? Somehow I was trying to use T_EXT only once, but I have been unsuccessful.
July 17, 2006 - 1:03 pm UTC
maybe if you explain what your desired "end result" is (heck, skip the MERGE - the SELECT is the problem here) we can look. Problem is, you have given us a "query that doesn't work" but no real explanation of the goal - the specification.
my comment about the external table is simply "it doesn't need to be an external table for purposes of this problem - it just needs to be a table, hence the example should be SO MUCH SIMPLER, smaller, more concise, less noise, fewer things to distract from the problem at hand"
We don't need the merge (not relevant to your problem, the problem is your select doesn't return what you want)
We don't need the external table (not relevant again, a plain old simple table does the job)
Maybe you can work with this query to get what you need - I don't see how B would be "20" since B is the sum from T_EXT and there is no EF to sum up in that table - nor how the tag would be 2 since B is missing:
ops$tkyte@ORA10GR2> select coalesce(tex.a,s.g), tex.b, s.h,
2 (CASE WHEN tex.B = s.h THEN 1
3 WHEN (tex.B > s.h OR tex.B < s.h) THEN 3
4 ELSE 4
5 END ) tag
6 from
7 (
8 select a, c, sum(b) over (partition by a,c) b
9 from t_ext
10 ) tex
11 full outer join
12 (
13 select g, h
14 from s
15 ) s
16 on (tex.a = s.g)
17 /
CO B H TAG
-- ---------- ---------- ----------
ab 30 20 3
ab 30 20 3
ab 30 20 3
bc 20 20 1
cd 20 30 3
cd 20 30 3
de 10 20 3
fg 10 4
ef 20 4
9 rows selected.
External table and outer join
Lise, July 25, 2006 - 8:49 am UTC
Back to the initial question in this thread, I find that when I outer join a table with an external table, and there are no matches in the external table, the last row in the flat file is always returned.
This is incorrect since the fields in the outer join do not match.
Say I select 1 row from the database table. There are no matching rows in the external table, but it still returns the last row in the flat file.
How can I get around this?
July 25, 2006 - 11:48 am UTC
examples always help - I cannot reproduce. show us how to set this up step by step by step.
ops$tkyte%ORA10GR2> create or replace directory data_dir as '/tmp/'
2 /
Directory created.
ops$tkyte%ORA10GR2> create table external_table
2 (EMPNO NUMBER(4) ,
3 ENAME VARCHAR2(10),
4 JOB VARCHAR2(9),
5 MGR NUMBER(4),
6 HIREDATE DATE,
7 SAL NUMBER(7, 2),
8 COMM NUMBER(7, 2) ,
9 DEPTNO NUMBER(2)
10 )
11 ORGANIZATION EXTERNAL
12 ( type oracle_loader
13 default directory data_dir
14 access parameters
15 ( fields terminated by ',' )
16 location ('emp.dat')
17 )
18 /
Table created.
ops$tkyte%ORA10GR2> host flat scott/tiger emp > /tmp/emp.dat
ops$tkyte%ORA10GR2> set linesize 105
ops$tkyte%ORA10GR2> set echo on
ops$tkyte%ORA10GR2> select * from external_table
2 /
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7369 SMITH CLERK 7902 17-DEC-80 800 20
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7566 JONES MANAGER 7839 02-APR-81 2975 20
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7782 CLARK MANAGER 7839 09-JUN-81 2450 10
7839 KING PRESIDENT 17-NOV-81 5000 10
7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7900 JAMES CLERK 7698 03-DEC-81 950 30
7902 FORD ANALYST 7566 03-DEC-81 3000 20
7934 MILLER CLERK 7782 23-JAN-82 1300 10
14 rows selected.
ops$tkyte%ORA10GR2> create table t ( x number );
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select t.x, et.empno from t left outer join external_table et on (t.x = et.empno);
no rows selected
ops$tkyte%ORA10GR2> select t.x, et.empno from t right outer join external_table et on (t.x = et.empno);
X EMPNO
---------- ----------
7788
7369
7499
7521
7566
7654
7698
7782
7839
7844
7876
7900
7902
7934
14 rows selected.
this is what I mean...
Lise, July 26, 2006 - 8:32 am UTC
alter table t add constraint PK_1 primary key (x)
using index;
insert into t values(1);
SQL> SELECT * FROM external_table;
EMPNO
-----
2
3
4
5
0
SQL> select * from t;
X N
---------- --
99999999
2
1
SQL>
SQL> select t.x, et.empno from t left outer join external_table
2 et on (t.x = et.empno) where t.x = 1;
X EMPNO
---------- -----
1 0
SQL>
SQL> alter table T
2 disable constraint PK_1;
Table altered
SQL>
SQL> select t.x, et.empno from t left outer join external_table
2 et on (t.x = et.empno)
3 WHERE t.x = 1
4 ;
X EMPNO
---------- -----
1
It seems that if you add a unique key on the database table, and there are no matching entries in the external table, it will always return the last row in the file.
If this is the case, how can I get round this?
July 26, 2006 - 11:29 am UTC
that would be a bug. and it is not one that is reproducing on 10gr2 for me.
Please contact support for that one.
Thanks
Lise, July 27, 2006 - 3:42 am UTC
Thanks Tom. I thought I was going mad, but there is always a logical explanation to everything.
a student, October 16, 2006 - 11:03 pm UTC
Sorry, I don't know how to ask a new question. I have a question like this:
I have about 30 tables of realtime data, and the data records grow rapidly. I have to dump the data of these tables periodically in PL/SQL. I want to create range-partitioned tables on a date column, and put the partitions of the same date in the same tablespace. Then I dump the partitions by transporting the tablespace. But I found it doesn't work.
Could you advise me?
October 17, 2006 - 4:21 am UTC
what do you mean you "found it doesn't work"
You can
a) range partition - definitely
b) put the partitions where ever you like - definitely
c) alter the table and exchange the partition with an empty table - making it so you can transport it
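Step (c) might be sketched like this (the table and partition names are hypothetical):

```sql
-- Create an empty table with the same shape as the partitioned table,
-- then swap the old date partition into it so the tablespace holding
-- it can be transported.
CREATE TABLE realtime_stage AS
  SELECT * FROM realtime_data WHERE 1 = 0;

ALTER TABLE realtime_data
  EXCHANGE PARTITION p_20061016 WITH TABLE realtime_stage;
```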
Regarding Limitation on File Size
Sourabh, December 13, 2006 - 9:05 am UTC
Hi Tom,
The following article states that in Oracle 9i an external table cannot access a file larger than 2 GB:
http://www.rampant-books.com/art_more_ext_tables.htm
I was not able to find whether the same is true for an Oracle 10g database. Could you please let me know whether the same limit applies in Oracle 10g, or whether it is an operating system file size issue?
Thanks in Advance.
A hard nut to crack
a reader, December 19, 2006 - 4:59 am UTC
Hi,Tom:
I've encountered a difficult problem.
I've created a new database, oradb; the Chinese characters were saved and displayed well, like this:
CREATE TABLE AA (A VARCHAR2(10), B VARCHAR2(10), C DATE);
insert into aa values('中中','aa','2006-12-10');
select * from aa;
A B C
---------- ---------- -------------------
中中 aa 2006-12-10 00:00:00
But after doing some exercises with backup, restore, and recovery, I couldn't save and display the Chinese characters correctly.
I don't remember exactly what I did; I just remember these:
1)I've done an exercise of rebuilding control file
2)I used RMAN to backup database and restore the database.
Now, the problem is:
CREATE TABLE AA (A VARCHAR2(10), B VARCHAR2(10), C DATE);
insert into aa values('中中','aa','2006-12-18');
Now select data in sqlplus
select * from aa;
A B C
---------- ---------- -------------------
VPVP aa 2006-12-18 00:00:00
select data in OEM's data editor
A B C
---------- ---------- -------------------
VPVP aa 18-12月-2006 12:00:00AM
I've checked the NLS parameters in NLS_DATABASE_PARAMETERS like this:
select * from nls_database_parameters
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC_CHARACTERS .,
NLS_CHARACTERSET ZHS16GBK
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE AMERICAN
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY $
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_RDBMS_VERSION 9.0.1.1.1
select * from v$nls_parameters
NLS_LANGUAGE SIMPLIFIED CHINESE
NLS_TERRITORY CHINA
NLS_CURRENCY RMB
NLS_ISO_CURRENCY CHINA
NLS_NUMERIC_CHARACTERS .,
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT YYYY-MM-DD HH24:MI:SS
NLS_DATE_LANGUAGE SIMPLIFIED CHINESE
NLS_CHARACTERSET ZHS16GBK
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY RMB
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
I've also checked the windows registry; the NLS_LANG is:
NLS_LANG=simplified chinese_china.zhs16gbk
In order to find the problem, I've also imported a DMP file from another database.
There is a table 'YHB' in the dmp file, and its data displays correctly, like this:
imp jhw/306@oradb tables=(YHB) file=c:\yhb.dmp;
But when I select data from the YHB table that I just imported, I found the data is not correct:
XM SM
--------- -----------------
V50`T11 3,<69\@mT1
and the right data should be
XM SM
--------- -----------------
值班员1 超级管理员
Then the questions are:
1) Why can't I save and display the Chinese characters correctly?
2) In NLS_DATABASE_PARAMETERS and V$NLS_PARAMETERS, some parameter values, such as NLS_LANGUAGE, are not the same. Why?
Thanks!!
How to specify the date format with time zone as date mask in External Tables?
Gowtham Sen, April 18, 2007 - 10:22 am UTC
Hi Tom,
I have a scenario where I am getting a date field with a time zone.
I created the external table as follows:
CREATE TABLE "Test_EXTBL"
(
"Test_dt" Date
)
ORGANIZATION EXTERNAL (
TYPE ORACLE_LOADER
DEFAULT DIRECTORY TEST_DIR
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
CHARACTERSET WE8MSWIN1252
STRING SIZES ARE IN BYTES
BADFILE TEST_DIR:'bad_test.txt'
DISCARDFILE TEST_DIR:'discard_test.txt'
LOGFILE TEST_DIR:'log_test.txt'
FIELDS
TERMINATED BY '|'
OPTIONALLY ENCLOSED BY '"' AND '"'
LRTRIM
MISSING FIELD VALUES ARE NULL
(
"Test_dt" CHAR(30) DATE_FORMAT DATE MASK 'RRMMDDHH24MISS'
)
)
LOCATION (
TEST_DIR:'test.txt'
)
)
REJECT LIMIT UNLIMITED
NOPARALLEL
;
The data in the file test.txt is as follows.
061109235500
061109235501
061109235502
061109235503
061110235504
061110235505
SQL> SELECT * FROM "Test_EXTBL";
Test_dt
-----------
11/9/2006 1
11/9/2006 1
11/9/2006 1
11/9/2006 1
11/10/2006
11/10/2006
6 rows selected
SQL>
Now it's working fine.
But now I am getting the data in the file as follows:
061109235500CST
061109235501CST
061109235502CST
061109235503CST
061110235504CST
061110235505CST
Here CST is the time zone.
I would like to set the date mask for this data, and if it fails processing (I mean, if there is wrong data) the record should go to the log.
For example, if I get one row value of 0611, that record should go to the log.
Then I created the external table in the following way:
drop table "Test_EXTBL";
CREATE TABLE "Test_EXTBL"
(
"Test_dt" Date
)
ORGANIZATION EXTERNAL (
TYPE ORACLE_LOADER
DEFAULT DIRECTORY TEST_DIR
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
CHARACTERSET WE8MSWIN1252
STRING SIZES ARE IN BYTES
BADFILE TEST_DIR:'bad_test.txt'
DISCARDFILE TEST_DIR:'discard_test.txt'
LOGFILE TEST_DIR:'log_test.txt'
FIELDS
TERMINATED BY '|'
OPTIONALLY ENCLOSED BY '"' AND '"'
LRTRIM
MISSING FIELD VALUES ARE NULL
(
"Test_dt" CHAR(30) DATE_FORMAT DATE MASK 'RRMMDDHH24MISSTZR'
)
)
LOCATION (
TEST_DIR:'test.txt'
)
)
REJECT LIMIT UNLIMITED
NOPARALLEL
;
The external table was created, but while querying the data it shows the error "Invalid date mask is specified".
I am using Oracle 10g.
Does oracle accept a date format with a time zone in external table date masks?
Please suggest how to implement this.
Thanks in Advance.
Thank you,
Regards,
Gowtham Sen.
April 18, 2007 - 12:40 pm UTC
it would be a timestamp, not a date
you would use a timestamp mask
DATE_FORMAT timestamp MASK
but, what does to_char(systimestamp,'tzr') return on your system?
It is still showing the error "Invalid date mask"
Gowtham Sen, April 19, 2007 - 2:51 am UTC
Thanks Tom.
I tried to_char(systimestamp,'tzr') on my system.
The output is as follows:
SQL> select to_char(systimestamp,'tzr') from dual;
TO_CHAR(SYSTIMESTAMP,'TZR')
--------------------------------
+05:30
I tried the date mask option, specifying "timestamp", in the following way:
drop table "Test_EXTBL";
CREATE TABLE "Test_EXTBL"
(
"Test_dt" Date
)
ORGANIZATION EXTERNAL (
TYPE ORACLE_LOADER
DEFAULT DIRECTORY TEST_DIR
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
CHARACTERSET WE8MSWIN1252
STRING SIZES ARE IN BYTES
BADFILE TEST_DIR:'bad_test.txt'
DISCARDFILE TEST_DIR:'discard_test.txt'
LOGFILE TEST_DIR:'log_test.txt'
FIELDS
TERMINATED BY '|'
OPTIONALLY ENCLOSED BY '"' AND '"'
LRTRIM
MISSING FIELD VALUES ARE NULL
(
"Test_dt" CHAR(30) DATE_FORMAT TIMESTAMP MASK 'RRMMDDHH24MISSTZR'
)
)
LOCATION (
TEST_DIR:'test.txt'
)
)
REJECT LIMIT UNLIMITED
NOPARALLEL
;
Even so, it showed the same error, "Invalid date mask".
Then I tried the following option:
"Test_dt" CHAR(30) DATE_FORMAT TIMESTAMP WITH TIMEZONE MASK 'RRMMDDHH24MISSTZR'
It works successfully!!
Thanks a lot.
You put me in the right direction to think.
Thank you,
Regards,
Gowtham Sen
ORA-01882: timezone region not found
Gowtham Sen, April 22, 2007 - 7:24 am UTC
I am facing one more problem while converting a date field with a time zone.
Initially, I was getting the data in the file as follows:
061109235500CST
061109235501CST
061109235502CST
061109235503CST
061110235504CST
061110235505CST
The timezone masking is working fine.
But in some records I am getting data as follows.
061109235500CDT
061109235501CDT
061109235502CDT
061109235503CDT
061110235504CDT
061110235505CDT
Then it shows the error "Invalid date/time conversion".
Then I verified the conversion at the sql prompt as follows:
SQL> select to_timestamp_tz('061109235500CST', 'RRMMDDHH24MISSTZR') from dual;
TO_TIMESTAMP_TZ('061109235500C
------------------------------
09-NOV-06 11.
SQL>
It's working fine with the timezone CST.
But when I specify the time zone as CDT, I get the following error:
SQL> select to_timestamp_tz('061109235500CDT', 'RRMMDDHH24MISSTZR') from dual;
select to_timestamp_tz('061109235500CDT', 'RRMMDDHH24MISSTZR') from dual
ORA-01882: timezone region not found
I even verified this in the table V$TIMEZONE_NAMES; it's there.
I don't know why I am getting this error.
Please help me with this.
Thanks in advance.
Thank you,
Regards,
Gowtham Sen.
Position based data
Chan, April 15, 2008 - 9:17 am UTC
Tom:
It looks like I can create external tables for position-based data files also.
Now, I have my data file organized something like this:
VE2ACEX76F5426800000002299K250404 11160746071 3 2
VF2ACEX76F542682000 00001 90L03967AAA
VG2ACEX76F54268 90L03967 RETURN/REWORK TO SUPPL
--- some more -- data
VE2ACEX76F5426900000002099K250405 11160746071 3 2
VF2ACEX76F542692000 00001 90L03967AAA
VG2ACEX76F54269 90L03967 RETURN/REWORK TO SUPPL
---
lines starting with VE, VF, VG in the data file make up a single record.
The values like 54268 and 54269 are key columns, and each line carries this key value.
My question: Can I handle this kind of data file, where one record's data is scattered across multiple lines, with the key value repeated on each line?
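This question went unanswered in the thread, but one common workaround is to pre-assemble the VE/VF/VG lines into one physical record per key before loading. A rough Python sketch, assuming the key always sits at a fixed offset (the slice below is illustrative, not taken from the thread):

```python
# Group lines by the key value (here assumed at character positions
# 10-14, e.g. '54268' in 'VE2ACEX76F54268...') and join each group
# into a single pipe-delimited physical record.
def assemble(lines, key_slice=slice(10, 15)):
    records = {}
    for line in lines:
        records.setdefault(line[key_slice], []).append(line.rstrip('\n'))
    return ['|'.join(parts) for parts in records.values()]

sample = ['VE2ACEX76F54268AAA', 'VF2ACEX76F54268BBB', 'VE2ACEX76F54269CCC']
print(assemble(sample))
```

The resulting single-line records can then be loaded through an ordinary newline-delimited external table.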
How to Load multiple files using External Table
Anan, August 27, 2008 - 3:48 pm UTC
Tom,
Can you help me with an external table script to load text data from multiple files (data_daily1.txt, 1.4 GB; data_daily2.txt, 1.1 GB; data_daily3.txt, 2.1 GB) using external tables?
Thanks,
I am using sqlloader
====================
load data
infile 'data_daily1.txt'
infile 'data_daily2.txt'
infile 'data_daily3.txt'
append
into table daily_upload
fields terminated by ","
optionally enclosed by '"'
trailing nullcols
(col1,
col2,
col3,
col4,
col5)
August 29, 2008 - 1:11 pm UTC
<b>$ sqlldr / test.ctl external_table=generate_only</b>
sqlldr will generate all you need for you..
(test.ctl is your control file, / is your user/password)
I created a table daily_load with 5 columns, all numbers - you would of course have your real types in there... and then
<b>$ cat test.log
</b>
SQL*Loader: Release 10.2.0.2.0 - Production on Fri Aug 29 12:48:05 2008
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Control File: test.ctl
There are 3 data files:
Data File: data_daily1.txt
Bad File: data_daily1.bad
Discard File: none specified
(Allow all discards)
Data File: data_daily2.txt
Bad File: data_daily2.bad
Discard File: none specified
(Allow all discards)
Data File: data_daily3.txt
Bad File: data_daily3.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Continuation: none specified
Path used: External Table
Table DAILY_UPLOAD, loaded from every logical record.
Insert option in effect for this table: APPEND
TRAILING NULLCOLS option in effect
Column Name Position Len Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
COL1 FIRST * , O(") CHARACTER
COL2 NEXT * , O(") CHARACTER
COL3 NEXT * , O(") CHARACTER
COL4 NEXT * , O(") CHARACTER
COL5 NEXT * , O(") CHARACTER
CREATE TABLE statement for external table:
------------------------------------------------------------------------
CREATE TABLE "SYS_SQLLDR_X_EXT_DAILY_UPLOAD"
(
"COL1" NUMBER(38),
"COL2" NUMBER(38),
"COL3" NUMBER(38),
"COL4" NUMBER(38),
"COL5" NUMBER(38)
)
ORGANIZATION external
(
TYPE oracle_loader
DEFAULT DIRECTORY UPLOAD_DIRECTORY
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
BADFILE 'UPLOAD_DIRECTORY':'data_daily1.bad'
LOGFILE 'test.log_xt'
READSIZE 1048576
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' LDRTRIM
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
(
"COL1" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"COL2" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"COL3" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"COL4" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"COL5" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
)
)
location
(
'data_daily1.txt',
'data_daily2.txt',
'data_daily3.txt'
)
)REJECT LIMIT UNLIMITED
INSERT statements used to load internal tables:
------------------------------------------------------------------------
INSERT /*+ append */ INTO DAILY_UPLOAD
(
COL1,
COL2,
COL3,
COL4,
COL5
)
SELECT
"COL1",
"COL2",
"COL3",
"COL4",
"COL5"
FROM "SYS_SQLLDR_X_EXT_DAILY_UPLOAD"
How to Load multiple files using External Table
Anan, September 05, 2008 - 2:46 pm UTC
Thanks Tom.
I have another question regarding location.
.....
location
(
'data_daily1.txt',
'data_daily2.txt',
'data_daily3.txt'
)
....
I may have 1 to n daily text files to load instead of using static files.
'data_daily1.txt',
'data_daily2.txt',
'data_daily3.txt',
......
......
'data_dailyn.txt'
Can I use data_daily*.txt instead of static file names, or can you suggest an alternate method for referencing multiple text files in LOCATION?
Thanks,
September 05, 2008 - 4:59 pm UTC
You would have to use the ALTER TABLE command to provide the list; no, "*" won't work.
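For example, a minimal sketch of repointing the external table at a new set of files - the table name is taken from the generated DDL earlier, and the file names here are hypothetical:

```sql
-- Repoint the external table at today's set of files.  The list must be
-- enumerated explicitly; the database will not expand a wildcard, so the
-- process that drops files into the directory would typically build it.
ALTER TABLE "SYS_SQLLDR_X_EXT_DAILY_UPLOAD"
  LOCATION ('data_daily1.txt', 'data_daily2.txt',
            'data_daily3.txt', 'data_daily4.txt');
```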
How to Load multiple files with Header Rows using External Table
Liz, September 12, 2008 - 8:17 am UTC
Hi Tom,
I have a similar issue loading multiple files into one external table, except that the files all have header rows that I wish to skip.
I have tried using SKIP 1, but this only seems to skip the header row of the first file, not the header row of every file.
Any ideas how I can do this?
September 16, 2008 - 1:23 pm UTC
Is there something you can use to "identify" your header rows - what do they look like?
Map all of the fields as varchar2 and use a where clause to filter out the rows you do not want to consider.
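A sketch of that approach - the table and column names here are hypothetical; the external table maps every field as varchar2, and the insert filters out the header rows before converting types:

```sql
-- Hypothetical: ext_all_text maps every field as VARCHAR2, so header
-- rows parse without errors; the WHERE clause discards them here.
INSERT INTO accounts (acct_ref, completion_date)
SELECT TO_NUMBER(col1),
       TO_DATE(col2, 'DD-MON-YYYY')
  FROM ext_all_text
 WHERE col1 NOT LIKE 'Account%';   -- skip the repeated header line
```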
How to Load multiple files with Header Rows using External Table
Liz, September 19, 2008 - 7:15 am UTC
The header row is literally just column names...
Account Reference Number,Component Reference Number,CompleteionDate,PreviousCompletionDate,TransferDate,Customer ID,Number of Borrowers...
I have used unix to remove the header rows to resolve this for now. I did have a go at using LOAD WHEN, but I wanted to avoid changing the datatypes on the external table if possible.
Thank you
September 19, 2008 - 8:41 am UTC
ops$tkyte%ORA10GR2> create or replace directory tmpdir as '/home/tkyte'
2 /
Directory created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> CREATE TABLE ET
2 (
3 "X" NUMBER(38),
4 "Y" DATE,
5 "Z" VARCHAR2(30)
6 )
7 ORGANIZATION external
8 (
9 TYPE oracle_loader
10 DEFAULT DIRECTORY TMPDIR
11 ACCESS PARAMETERS
12 (
13 RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
14 LOAD WHEN ((1: 1) != "A")
15 BADFILE 'TMPDIR':'test.bad'
16 LOGFILE 'test.log_xt'
17 READSIZE 1048576
18 FIELDS TERMINATED BY "," LDRTRIM
19 MISSING FIELD VALUES ARE NULL
20 REJECT ROWS WITH ALL NULL FIELDS
21 (
22 "X" CHAR(255)
23 TERMINATED BY ",",
24 "Y" CHAR(255)
25 TERMINATED BY ",",
26 "Z" CHAR(255)
27 TERMINATED BY ","
28 )
29 )
30 location
31 (
32 'test1.dat', 'test2.dat'
33 )
34 )REJECT LIMIT UNLIMITED
35 /
Table created.
and given inputs:
ops$tkyte%ORA10GR2> !cat test1.dat
Account Reference Number,Component Reference Number,CompleteionDate
1,01-jan-2008,hello
2,01-jun-2008,world
ops$tkyte%ORA10GR2> !cat test2.dat
Account Reference Number,Component Reference Number,CompleteionDate
3,11-jan-2008,HELLO
4,11-jun-2008,WORLD
ops$tkyte%ORA10GR2> select * from et;
X Y Z
---------- --------- ------------------------------
1 01-JAN-08 hello
2 01-JUN-08 world
3 11-JAN-08 HELLO
4 11-JUN-08 WORLD
and the discard file would have the headers:
ops$tkyte%ORA10GR2> !cat ET_17048.dsc
Account Reference Number,Component Reference Number,CompleteionDate
Account Reference Number,Component Reference Number,CompleteionDate
Changing a column width
Craig, April 01, 2009 - 12:02 pm UTC
I'd like to further address question #1 of the original post. Is there a way to change a column width on an existing external table?
DROP TABLE et_alter_test PURGE;
create table et_alter_test
(
col_1 VARCHAR2(10)
, col_2 VARCHAR2(10)
)
ORGANIZATION external
(
TYPE oracle_loader
DEFAULT DIRECTORY IN_DIR
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8MSWIN1252
BADFILE 'IN_DIR':'test.bad'
LOGFILE 'IN_DIR':'test.log_xt'
READSIZE 1048576
FIELDS TERMINATED BY '::' LDRTRIM
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
(
col_1 CHAR(10)
, col_2 CHAR(10)
)
)
location
(
'file.txt'
)
)REJECT LIMIT 0;
Representative data:
hello::1234567890
hi::12345
error::123456789012345678
Everything was working fine until the source of this file decided that col_2 needed to be 20 characters wide instead of 10 (without telling me). So this morning I started getting rejects because the data is wider than the field.
SQL>
SQL> SELECT *
2 FROM et_alter_test;
ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-30653: reject limit reached
ORA-06512: at "SYS.ORACLE_LOADER", line 52
So I try to widen the field in the external table.
SQL>
SQL> SELECT c.column_name
2 , c.data_length
3 FROM user_tab_cols c
4 WHERE table_name = 'ET_ALTER_TEST';
COLUMN_NAME DATA_LENGTH
------------------------------ -----------
COL_1 10
COL_2 10
SQL> ALTER TABLE et_alter_test MODIFY col_2 VARCHAR2(20);
Table altered
SQL>
SQL> SELECT c.column_name
2 , c.data_length
3 FROM user_tab_cols c
4 WHERE table_name = 'ET_ALTER_TEST';
COLUMN_NAME DATA_LENGTH
------------------------------ -----------
COL_1 10
COL_2 20
SQL> SELECT *
2 FROM et_alter_test;
ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-30653: reject limit reached
ORA-06512: at "SYS.ORACLE_LOADER", line 52
I still get the rejects, because the alter didn't alter the bottom portion (the sql*ldr portion) of the external table definition. A quick look at user_external_tables reveals this. So, short of dropping/recreating the external table, is there a way to make such a modification to an existing external table?
Changing the DDL to read as follows yields the identical result...except that now issuing the alter actually fixes the problem.
...
REJECT ROWS WITH ALL NULL FIELDS
(
col_1 CHAR(10)
, col_2 CHAR(4000)
)
...
SQL> ALTER TABLE et_alter_test MODIFY col_2 VARCHAR2(20);
Table altered
SQL> SELECT *
2 FROM et_alter_test;
COL_1 COL_2
---------- --------------------
hello 1234567890
hi 12345
error 123456789012345678
SQL>
Assuming you can't change a column width, is there any reason at all why I shouldn't always declare the lower portion to be CHAR(4000) and set the upper declarations to the proper width? And if so, why is the default width only 255?
April 01, 2009 - 4:33 pm UTC
By default the CHAR(NN) is 255 in sqlldr/external tables. Why 255? Because it dates back to the beginning of time, when strings had smaller maximum sizes - way, way back.
Having it at 4000 will just (marginally) increase the memory needed for a staging buffer.
Changing a column width
Craig, April 01, 2009 - 5:42 pm UTC
So there's no way to change it short of dropping/recreating the table?
April 02, 2009 - 9:18 am UTC
I didn't say that - a quick glance at the documentation shows otherwise:
http://docs.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_3001.htm#i2104234
ops$tkyte%ORA10GR2> create table t
2 (
3 col_1 VARCHAR2(10)
4 , col_2 VARCHAR2(10)
5 )
6 ORGANIZATION external
7 (
8 TYPE oracle_loader
9 DEFAULT DIRECTORY IN_DIR
10 ACCESS PARAMETERS
11 (
12 RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8MSWIN1252
13 BADFILE 'IN_DIR':'test.bad'
14 LOGFILE 'IN_DIR':'test.log_xt'
15 READSIZE 1048576
16 FIELDS TERMINATED BY '::' LDRTRIM
17 MISSING FIELD VALUES ARE NULL
18 REJECT ROWS WITH ALL NULL FIELDS
19 (
20 col_1 CHAR(10)
21 , col_2 CHAR(10)
22 )
23 )
24 location
25 (
26 'file.txt'
27 )
28 )REJECT LIMIT 0;
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> alter table t default directory in_dir access parameters
2 (
3 RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8MSWIN1252
4 BADFILE 'IN_DIR':'test.bad'
5 LOGFILE 'IN_DIR':'test.log_xt'
6 READSIZE 1048576
7 FIELDS TERMINATED BY '::' LDRTRIM
8 MISSING FIELD VALUES ARE NULL
9 REJECT ROWS WITH ALL NULL FIELDS
10 (
11 col_1 CHAR(20)
12 , col_2 CHAR(10)
13 )
14 )
15 /
Table altered.
ops$tkyte%ORA10GR2> alter table t modify col_1 varchar2(20);
Table altered.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select dbms_metadata.get_ddl( 'TABLE', 'T' ) from dual;
DBMS_METADATA.GET_DDL('TABLE','T')
-------------------------------------------------------------------------------
CREATE TABLE "OPS$TKYTE"."T"
( "COL_1" VARCHAR2(20),
"COL_2" VARCHAR2(10)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY "IN_DIR"
ACCESS PARAMETERS
( RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8MSWIN1252
BADFILE 'IN_DIR':'test.bad'
LOGFILE 'IN_DIR':'test.log_xt'
READSIZE 1048576
FIELDS TERMINATED BY '::' LDRTRIM
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
(
col_1 CHAR(20)
, col_2 CHAR(10)
)
)
LOCATION
( 'file.txt'
)
)
How to get external table data from another unix server
Anan, February 02, 2010 - 11:52 am UTC
Hi Tom
The Oracle 10g database is on Unix server A.
The external table's file, ou.txt, is on Unix server B.
(ou.txt is located in the ABMT_FTP directory on Unix server B: /apps01/abm/ou.txt)
CREATE TABLE T_OU_MAP_EXT
(
EFF_DT DATE,
OU_ID VARCHAR2(12 BYTE),
OU_PRNT_ID VARCHAR2(12 BYTE),
BILL_ENT_ID VARCHAR2(6 BYTE)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY ABMT_FTP
ACCESS PARAMETERS
( records delimited by newline
badfile ABMT_FTP:'ou_ext.bad'
logfile ABMT_FTP:'ou_ext.log'
fields terminated by ','
OPTIONALLY ENCLOSED BY '"'
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS )
LOCATION (ABMT_FTP:'ou.txt')
)
REJECT LIMIT UNLIMITED
NOPARALLEL
NOMONITORING;
How do I map the LOCATION to the other server (B)?
Thanks,
Anan
February 02, 2010 - 12:49 pm UTC
NFS would be one path...
Basically, the file system MUST be available to the database server instance - it must be.
Otherwise, you would be asking the database server instance to read the file system of another machine directly - very virus-like. It won't happen.
External tables and Multilingual characters
MK, July 02, 2010 - 3:33 pm UTC
Hi,
I am trying to help a client find a better, more efficient way of unloading data from a Microsoft SQL Server 2008 database to Oracle 11g at a remote location. They currently use SSIS to do this and have complained that it is extremely slow. I have managed to convince them to use Oracle's high-speed flat-file loaders instead, which do the job much quicker. They upload music-related data such as music tracks. My question: will the external table script I have written below work to load track names or artist names that can contain international characters ranging from Chinese to Hungarian, or would I need to do something special to make this work? The target Oracle instance has an NLS character set of AL32UTF8. Also, if you have any suggestions on better formatting or better options, I would be more than interested to hear them!
-- CREATE OR REPLACE DIRECTORY dmpdir AS '/home/oracle/dpump_dir';
-- grant all on DIRECTORY dmpdir to public;
/* External Table Creation */
CREATE TABLE EXT_TRACK
(
"ID" NUMBER(10) ,
"PRIMARY_GENRE_ID" NUMBER(10) ,
"PRIMARY_ARTIST_ID" NUMBER(10) ,
"PRIMARY_IMAGE_ID" NUMBER(10) ,
"PRIMARY_STORY_ID" NUMBER(10) ,
"SPECIFIC_PRICE_ID" NUMBER(10) ,
"EDITOR_NOTE_ID" NUMBER(10) ,
"RELEASE_ID" NUMBER(10) ,
"SIDELOADED_BY_CUSTOMER_ID" NUMBER(10) ,
"LYRICS_STORY_ID" NUMBER(10) ,
"ARTIST_SET_ID" NUMBER(10) ,
"IMAGE_SET_ID" NUMBER(10) ,
"STORY_SET_ID" NUMBER(10) ,
"SIMILAR_TRACK_SET_ID" NUMBER(10) ,
"RELATED_TRACK_SET_ID" NUMBER(10) ,
"RECORD_LABEL_COMPANY_ID" NUMBER(10) ,
"ENABLED" NUMBER(1) ,
"ENCRYPTED" NUMBER(1) ,
"ADULT_CONTENT" NUMBER(1) ,
"NOT_IN_LIBRARY" NUMBER(1) ,
"SIDELOADED_BY_CUSTOMER" NUMBER(1) ,
"YEAR_RELEASED" NUMBER(4) ,
"BEATS_PER_MINUTE" NUMBER(4) ,
"DISC_NUMBER" NUMBER(4) ,
"DISC_COUNT" NUMBER(4) ,
"TRACK_NUMBER" NUMBER(4) ,
"TRACK_COUNT" NUMBER(4) ,
"TOTAL_TIME_IN_SECONDS" NUMBER(12) ,
"ISRC_CODE" VARCHAR2(16 CHAR) ,
"KIND" VARCHAR2(32 CHAR) ,
"GUID" VARCHAR2(32 CHAR) ,
"RECORD_LABEL_COMPANY_NAME" VARCHAR2(256 CHAR) ,
"TITLE" VARCHAR2(256 CHAR) ,
"RELEASE_TITLE" VARCHAR2(256 CHAR) ,
"ARTIST_NAME" VARCHAR2(256 CHAR) ,
"DESCRIPTION" VARCHAR2(256 CHAR) ,
"P_LINE" VARCHAR2(256 CHAR) ,
"C_LINE" VARCHAR2(256 CHAR) ,
"DATA_CLASSIFICATION" NUMBER(10) ,
"CREATED" TIMESTAMP(3) ,
"INSERTED" TIMESTAMP(3) ,
"MODIFIED" TIMESTAMP(3) ,
"SUB_RECORD_LABEL_COMPANY_NAME" VARCHAR2(256 CHAR) ,
"COMPOSER_SET_ID" NUMBER(10)
)
ORGANIZATION external
(
TYPE oracle_loader
DEFAULT DIRECTORY dmpdir
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE CHARACTERSET AL32UTF8
BADFILE 'xt_track.bad'
LOGFILE 'xt_track.log'
READSIZE 1048576
FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"' LDRTRIM
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
(
"ID" CHAR(300)
TERMINATED BY "|" ,
"PRIMARY_GENRE_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"PRIMARY_ARTIST_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"PRIMARY_IMAGE_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"PRIMARY_STORY_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"SPECIFIC_PRICE_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"EDITOR_NOTE_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"RELEASE_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"SIDELOADED_BY_CUSTOMER_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"LYRICS_STORY_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"ARTIST_SET_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"IMAGE_SET_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"STORY_SET_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"SIMILAR_TRACK_SET_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"RELATED_TRACK_SET_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"RECORD_LABEL_COMPANY_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"ENABLED" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"ENCRYPTED" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"ADULT_CONTENT" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"NOT_IN_LIBRARY" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"SIDELOADED_BY_CUSTOMER" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"YEAR_RELEASED" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"BEATS_PER_MINUTE" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"DISC_NUMBER" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"DISC_COUNT" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"TRACK_NUMBER" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"TRACK_COUNT" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"TOTAL_TIME_IN_SECONDS" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"ISRC_CODE" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"KIND" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"GUID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"RECORD_LABEL_COMPANY_NAME" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"TITLE" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"RELEASE_TITLE" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"ARTIST_NAME" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"DESCRIPTION" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"P_LINE" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"C_LINE" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"DATA_CLASSIFICATION" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"CREATED" CHAR(300) date_format timestamp mask "DD/MM/YYYY HH24:MI:SS.FF",
"INSERTED" CHAR(300) date_format timestamp mask "DD/MM/YYYY HH24:MI:SS.FF",
"MODIFIED" CHAR(300) date_format timestamp mask "DD/MM/YYYY HH24:MI:SS.FF",
"SUB_RECORD_LABEL_COMPANY_NAME" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"',
"COMPOSER_SET_ID" CHAR(300)
TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"'
)
)
location
(
'track.dat'
)
)REJECT LIMIT UNLIMITED PARALLEL
;
Inserting into multiple tables from external file
Shimmy, January 18, 2012 - 11:41 am UTC
Hi,
I have an external file with the following info
ST*850*0001
PO1*1*1*2120251
PO1*2*1*1141332
SE*3*0001
ST*850*0002
PO1*1*2*1382225
PO1*2*1*3444486
PO1*3*1*2463487
PO1*4*1*1146380
SE*5*0002
CREATE TABLE EXTERNAL_FILE
(
EDI_TEXT VARCHAR2(2000)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY EXT
ACCESS PARAMETERS
( RECORDS DELIMITED BY NEWLINE
NOBADFILE NODISCARDFILE NOLOGFILE
FIELDS(
EDI_TEXT CHAR(2000)
)
)
LOCATION (EXT:'EDI850.txt')
)
REJECT LIMIT UNLIMITED;
CREATE TABLE ORD_HDR
(ORDER_NUMBER NUMBER(10) NOT NULL PRIMARY KEY);
CREATE TABLE ORD_DTL
(ORDER_NUMBER NUMBER(10) NOT NULL,
ITEM_NUMBER NUMBER(10),
QUANTITY NUMBER(10));
A line starting with ST denotes the beginning of an order and a line starting with SE denotes the end of that order.
Lines starting with PO1 carry the order detail info.
The fields are separated by *.
On the ST and SE lines, the third field is the order number.
On the PO1 line, the third field is the quantity and fourth field is the item number.
Is it possible to use just SQL to populate ORD_HDR and ORD_DTL by reading the external table only once?
I am looking for output which looks like the following:
SELECT ORDER_NUMBER FROM ORD_HDR;
ORDER_NUMBER
0001
0002
SELECT ORDER_NUMBER FROM ORD_DTL;
ORDER_NUMBER ITEM_NUMBER QUANTITY
0001 2120251 1
0001 1141332 1
0002 1382225 2
0002 3444486 1
0002 2463487 1
0002 1146380 1
Thank you
January 18, 2012 - 12:58 pm UTC
ops$tkyte%ORA11GR2> create or replace directory MY_DIR as '/home/tkyte'
2 /
Directory created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> CREATE TABLE edifact_data
2 (
3 "REC_TYPE" VARCHAR2(20),
4 "EDIFACT_TT" VARCHAR2(20),
5 "FIELD1" VARCHAR2(20),
6 "FIELD2" VARCHAR2(20),
7 "FIELD3" VARCHAR2(20),
8 "FIELD4" VARCHAR2(20)
9 )
10 ORGANIZATION external
11 (
12 TYPE oracle_loader
13 DEFAULT DIRECTORY MY_DIR
14 ACCESS PARAMETERS
15 (
16 RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
17 BADFILE 'data.bad'
18 LOGFILE 'x.log_xt'
19 READSIZE 1048576
20 FIELDS TERMINATED BY "*" LDRTRIM
21 MISSING FIELD VALUES ARE NULL
22 REJECT ROWS WITH ALL NULL FIELDS
23 (
24 "REC_TYPE" CHAR(255)
25 TERMINATED BY "*",
26 "EDIFACT_TT" CHAR(255)
27 TERMINATED BY "*",
28 "FIELD1" CHAR(255)
29 TERMINATED BY "*",
30 "FIELD2" CHAR(255)
31 TERMINATED BY "*",
32 "FIELD3" CHAR(255)
33 TERMINATED BY "*",
34 "FIELD4" CHAR(255)
35 TERMINATED BY "*"
36 )
37 )
38 location
39 (
40 'data.txt'
41 )
42 )REJECT LIMIT UNLIMITED
43 /
Table created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create table ord_hdr( order_number varchar2(20) );
Table created.
ops$tkyte%ORA11GR2> create table ord_dtl( order_number varchar2(20), item_number varchar2(20), quantity varchar2(20) );
Table created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> insert
2 when rec_type = 'ST' then into ord_hdr(order_number) values ( ONUM )
3 when rec_type = 'PO1' then into ord_dtl(order_number,item_number,quantity) values ( ONUM, field2, field1 )
4 select rec_type,
5 max(case when rec_type = 'ST' then field1 end) over (order by r) onum,
6 field1, field2
7 from (select e.*, rownum r from edifact_data e)
8 /
8 rows created.
ops$tkyte%ORA11GR2> select * from ord_hdr;
ORDER_NUMBER
--------------------
0001
0002
ops$tkyte%ORA11GR2> select * from ord_dtl;
ORDER_NUMBER ITEM_NUMBER QUANTITY
-------------------- -------------------- --------------------
0001 2120251 1
0001 1141332 1
0002 1382225 2
0002 3444486 1
0002 2463487 1
0002 1146380 1
6 rows selected.
Inserting into multiple tables from external file
Shimmy, January 20, 2012 - 10:42 am UTC
Thank you for the quick feedback.
The above query will only work if the order number field is in ascending order. If the data in the file is as follows, it won't work, because 0002 is greater than 0001 and the running MAX of the order number does not reset when it reaches the "ST*850*0001" line.
ST*850*0002
PO1*1*2*1382225
PO1*2*1*3444486
PO1*3*1*2463487
PO1*4*1*1146380
SE*5*0002
ST*850*0001
PO1*1*1*2120251
PO1*2*1*1141332
SE*3*0001
Can you please offer any suggestion to fix it?
Thank you
January 20, 2012 - 10:46 am UTC
use last_value(column ignore nulls) instead then.
ops$tkyte%ORA11GR2> create table ord_hdr( order_number varchar2(20) );
Table created.
ops$tkyte%ORA11GR2> create table ord_dtl( order_number varchar2(20), item_number varchar2(20), quantity varchar2(20) );
Table created.
ops$tkyte%ORA11GR2> !cat data.txt
ST*850*0002
PO1*1*2*1382225
PO1*2*1*3444486
PO1*3*1*2463487
PO1*4*1*1146380
SE*5*0002
ST*850*0001
PO1*1*1*2120251
PO1*2*1*1141332
SE*3*0001
ops$tkyte%ORA11GR2> insert
2 when rec_type = 'ST' then into ord_hdr(order_number) values ( ONUM )
3 when rec_type = 'PO1' then into ord_dtl(order_number,item_number,quantity) values ( ONUM, field2, field1 )
4 select rec_type,
5 last_value(case when rec_type = 'ST' then field1 end IGNORE NULLS) over (order by r) onum,
6 field1, field2
7 from (select e.*, rownum r from edifact_data e)
8 /
8 rows created.
ops$tkyte%ORA11GR2> select * from ord_hdr;
ORDER_NUMBER
--------------------
0002
0001
ops$tkyte%ORA11GR2> select * from ord_dtl;
ORDER_NUMBER ITEM_NUMBER QUANTITY
-------------------- -------------------- --------------------
0002 1382225 2
0002 3444486 1
0002 2463487 1
0002 1146380 1
0001 2120251 1
0001 1141332 1
6 rows selected.
External Table
A reader, January 26, 2012 - 1:52 am UTC
hi Tom,
I have to create an external table. There is an FTP server whose path is provided in utl_file_dir.
1. Is it possible to keep the data file on one Unix server when Oracle is installed on another server? Does creating the logical mapping through CREATE DIRECTORY work? I read about NFS mount points, but can an FTP server, with an entry in the init.ora parameter, be equivalent to that?
2. If it is possible, can we change the names of the data files? We may have multiple data files every day.
Version: Oracle 11g R2
January 26, 2012 - 10:34 am UTC
a) do not use utl_file_dir - it is insecure, inflexible, and considered deprecated.
b) how do you have an FTP server path in utl_file_dir??? That doesn't compute; I don't know what that means.
1) you have to have the file available in the file system of the server - via an nfs mount or local file system.
2) yes, you can rename files using utl_file, you can rename the file the external table points to using ALTER TABLE
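Both approaches might be sketched like this - the directory, table, and file names below are hypothetical:

```sql
-- Option A: rename today's arrival to the fixed name the external
-- table already points at (UTL_FILE.FRENAME works on directory objects).
BEGIN
  UTL_FILE.FRENAME(
    src_location  => 'MY_DIR',
    src_filename  => 'data_20120126.txt',  -- hypothetical incoming name
    dest_location => 'MY_DIR',
    dest_filename => 'current.txt',
    overwrite     => TRUE );
END;
/

-- Option B: leave the file name alone and repoint the external table.
ALTER TABLE my_ext_table LOCATION ('data_20120126.txt');
```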
external table with multiple files and skip
Steve, March 14, 2012 - 11:02 am UTC
Tom,
How is the SKIP option supposed to work when you have multiple files associated with the external table?
I have a situation where I would like to read from multiple files all of the same format, but each file would have a header row.
I've tried this below, but it seems to "skip 1" only for the first file in the list, not for each file. Would it be fair to say that the 3 files in this case are glued together and then the skip 1 is applied?
Create Or Replace Directory Test_Dir as 'd:\'
/
Drop Table Ext_Test_Tab
/
Create Table Ext_Test_Tab
(
A_Column Varchar2(25),
B_Column Varchar2(30)
)
Organization External
(
Type Oracle_Loader
Default Directory Test_Dir
Access Parameters (
Records Delimited By Newline
Badfile Test_Dir:'ext.bad'
Skip 1
Logfile Test_Dir:'ext.log'
Fields Terminated By ","
Optionally Enclosed By '"'
Missing Field Values Are Null
(
A_Column,
B_Column
)
)
Location ('ext_001.csv')
)
Reject Limit Unlimited
/
/*
File ext_001.csv:
F1 Header 1,F1 Header 2
A,AA
File ext_002.csv:
F2 Header 1,F2 Header 2
B,BB
File ext_003.csv:
F3 Header 1,F3 Header 2
C,CC
*/
Select *
From Ext_Test_Tab
Alter table Ext_Test_Tab Location ('ext_001.csv','ext_002.csv','ext_003.csv')
/
Select *
From Ext_Test_Tab
Thanks
Steve
March 14, 2012 - 3:23 pm UTC
skip N would skip the first N records of the first file.
is there something in the header record you can look at and filter out with a where clause?
Steve, March 15, 2012 - 3:01 am UTC
Tom,
Unfortunately not. The files are created by the users of our software in MS Excel, so they all have headers, but the headers can be different in each file because they are specified by the user. The reason for multiple files is that different areas of their business each produce a separate file, which we then pull in via the external table. The file format also contains date and number fields (not just varchar2 as in my example), so the header rows from the two other files currently end up in the bad file - which, at a push, is OK, just not ideal.
Steve
March 15, 2012 - 7:34 am UTC
the other alternative is to have it point to a file at a time and load each one.
loop while more files
alter table external_table to point to new location
insert into table select * from external_table
end loop
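That loop might be sketched in PL/SQL like this, using the Ext_Test_Tab table from above and a hypothetical target table; the file list is hard-coded here, but in practice it would come from wherever the files are registered:

```sql
BEGIN
  FOR f IN (SELECT column_value AS fname
              FROM TABLE(sys.odcivarchar2list(
                     'ext_001.csv', 'ext_002.csv', 'ext_003.csv')))
  LOOP
    -- Repoint the external table at the next file.  The names come
    -- from a trusted, hard-coded list, so concatenation is safe here.
    EXECUTE IMMEDIATE
      'alter table ext_test_tab location (''' || f.fname || ''')';
    -- SKIP 1 in the access parameters now applies to this one file.
    INSERT INTO target_tab SELECT * FROM ext_test_tab;
  END LOOP;
  COMMIT;
END;
/
```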
Stop those logfiles !!
Ankit, March 19, 2012 - 9:46 am UTC
Hi Tom
This is in relation to the initial question asked, but I was not able to find what I needed to know, which is why I am posting this. After creating an external table, every SELECT creates/appends to the log file. Why does this happen? Can it be stopped once the table has been created? I fear that the log file will keep growing and occupy space that is not required at all.
Thanks a lot
Ankit
March 19, 2012 - 10:33 am UTC
You can use
NOLOGFILE
instead of LOGFILE 'filename' in your create table.
Or, just set up a process to occasionally zero out the log file - which might be better, as it would give you some diagnostic information if something goes wrong.
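For example, a sketch of an external table with all per-select file output suppressed (directory and file names follow the earlier examples):

```sql
CREATE TABLE ext_no_log
(
  a_column VARCHAR2(25),
  b_column VARCHAR2(30)
)
ORGANIZATION EXTERNAL
(
  TYPE oracle_loader
  DEFAULT DIRECTORY test_dir
  ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY NEWLINE
    NOBADFILE NODISCARDFILE NOLOGFILE  -- nothing written on each select
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
  )
  LOCATION ('ext_001.csv')
)
REJECT LIMIT UNLIMITED;
```

With NOLOGFILE you lose the diagnostic trail, which is why periodically truncating a real log file can be the better trade-off.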
Creating Directories with Parameters
Rami, August 04, 2012 - 2:35 am UTC
Hi Tom,
I am trying to create a directory in my code so I can write to a file. However, I want this directory's path to be set using the function z_site_preference('ANYTHING').
I keep receiving an error when I try:
create or replace directory DIRE_CTORY as z_site_preference('ANYTHING')
1. How would I do it?
2. Can I pass a parameter upon creation of the directory or it is not applicable to do so?
Thanks a lot
August 17, 2012 - 10:08 am UTC
I don't even know what you mean.
If you are asking "can I bind an identifier in DDL", the answer is "absolutely not"
If you have an application that is dynamically creating directories - I have a real problem with your application. It is a HUGE security risk. I would definitely suggest (strongly) that you reconsider this - these directories are on the database server, they are accessed with the Oracle account privileges, they can cause real damage if you mess up. I would not do this under any circumstance.
In fact, NO developer account - no code account would ever be granted an "ANY" privilege - which is needed to create directories.
Load data using external table vs java batch inserts
Lal, March 05, 2013 - 4:22 am UTC
Tom,
We have a module to load data from a file about 2 GB in size. This needs to be loaded into a database table; the number of records is around 3,000,000 rows.
Currently it is implemented in the app tier (Java): the data is loaded from the file as batch inserts of 200 records each, committing after every batch.
This takes around 30 minutes to complete.
I know that external tables are more useful for these types of loads, but to substantiate this I need some stats.
My question is: what are the benefits of using external tables over a Java program which does the parsing and batch inserts, committing 200 records per batch?
Also, if I use an external table, do I need to split the source file into several smaller files rather than loading from one big 2 GB file?
One more doubt I have: the table into which the data is loaded has heavy read operations.
What would be the best way to load this table with the new data?
Load the data into a temporary table and exchange partitions with the current table?
How exactly can this be done? What will happen to the read operations when the new partition is added and the old partition is deleted?
The file load will happen on a weekly basis; after the load, the old data needs to be removed and the new data shown.
Also, while removing the old data, there is some data which is not part of the file load and should not be deleted at all.
As always, expecting your precious input on the same.
March 05, 2013 - 11:53 am UTC
I know that external tables are more useful for these type of loads. But to
substantiate this i need some stats.
generate them then!!!! that is what I've been trying to show here for 13+ years. Benchmark, set up a *test*, in your environment with your data, on your hardware.
My question is what are the benefits of using external tables over using a java
program which does the parsing and batch inserts and commits 200 records per
batch.
you can use direct path load - insert /*+ APPEND */ (skips undo, no silly commits)
you can replace a ton of code with a single line of code
it is going to be faster than a slow by slow 200 row commit process.
Also if i use external table, do i need to split the source file into different
files of small sizes rather than loading from a big 2GB file?
no. 2gb / 3,000,000 rows is pretty darn small.
What would be best way to load this table with the new data.
direct path loads, no doubt.
using partitioning would be *great* if you can do that.
What will happen to the read operations when the
new partition is added and old partition is deleted?
already running queries will still see the old data, new queries will see the new data. In effect "nothing happens to them"
Batch load and exchange partition
Lal, March 09, 2013 - 1:30 pm UTC
Hi Tom,
Thanks for your reply. I understand that an external table is the best option. If I use insert with the append hint (after a truncate), will the number of blocks be similar to a normal insert? For fast read operations, the number of blocks should be small, right?
Regarding the table load i am rephrasing the question as given below.
I have a table T with 20 million rows with data distribution for different codes as
A 10 million
B 5 Million
C 1 million
rest in 1000s of rows for a code
This table data will be queried very frequently and should be available 24*7 for read operations. The data for each code will be refreshed weekly. For example code A data will be refreshed every Monday, Code B every Tuesday like that.
The following are the options we have thought for loading data.
Option 1
Insert into T with a publish indicator of N for the loaded data; if the load is successful, delete the old entries for the code
and update the indicator to Y. This can cause issues for code A, since delete is a heavy operation.
Option 2
Maintain a separate table T1 and load the data there (the passive area). If the load is successful, copy the rest of the data from T and make T1 the active table. When the next load comes, T will be truncated and the same process followed. This way the heavy delete operation can be avoided, but the code becomes complex in order to manage the active/passive setup. The application code will refer to a synonym T which points to the T or T1 table based on the active/passive logic.
One problem with the synonym approach is that if the T synonym is referred to multiple times in one read transaction, one query can go to T and the next query can go to T1 (if a synonym repoint happened in between).
Option 3
Partition the T table data by code (list partition). Load the data into a normal table and, if the load is successful, exchange it with the corresponding partition in T.
Which option do you suggest, considering the heavy read operations and occasional DML updates on T, and considering the availability requirements?
During the exchange partition, will the DMLs/queries be affected? Will it be a resource-intensive operation?
Please give your thoughts (new options?).
March 11, 2013 - 8:36 am UTC
If i use insert with append hint (after truncate) will the no of blocks be similar to the normal insert? For making the read operations fast, the no of blocks should be less right?
they can be - because you can use direct path compression (basic compression) alter table t compress; - then next time you direct path load it, it'll load compressed and might be 50% or less the size of the original table.
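The compress-then-reload sequence Tom describes could be sketched like this (table names are assumptions):

```sql
-- Enable basic compression: subsequent DIRECT PATH loads will store
-- the rows compressed; conventional inserts would not be compressed.
alter table t compress;

truncate table t;

-- The direct path (APPEND) load writes compressed blocks,
-- potentially halving the number of blocks to read later.
insert /*+ APPEND */ into t select * from external_table;
commit;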
option 3 wins hands down. 100% I would not consider anything else actually.
queries will not be affected, modifications of course would be - we'd need to lock the partition in question. I'd have to question how you could be doing modifications to data that is just going to be entirely wiped out, however; this doesn't even begin to make sense to me.
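Option 3 could be sketched as follows (column names, partition names, and the staging-table name are all assumptions):

```sql
-- List-partition T by code so each weekly refresh touches one partition.
create table t (
  id    number,
  code  varchar2(10),
  val   varchar2(100)
)
partition by list (code)
( partition p_a     values ('A'),
  partition p_b     values ('B'),
  partition p_c     values ('C'),
  partition p_other values (default)
);

-- Staging table with an identical shape (and matching indexes, if any).
create table t_stage as select * from t where 1 = 0;

-- ... load t_stage with the week's data for code A ...

-- Atomically swap the freshly loaded data in: a data-dictionary
-- operation, not a data copy.
alter table t exchange partition p_a with table t_stage
  including indexes without validation;
```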
Batch load and exchange partition
Lal, March 12, 2013 - 8:38 am UTC
Tom,
Thanks very much for your inputs and time.
When we tried exchanging a partition with 2 million rows it took 4 minutes. Most of the time was taken updating the global indexes (two global indexes, one for the PK (an id column) and one for the fetch). Without the global index maintenance, the time to exchange the partition was 4 seconds.
Also we faced the following error when we tried fetching data from the table while the exchange partition was happening.
java.sql.SQLException: ORA-00600: internal error code, arguments: [12406], [], [], [], [], [], [], [], [], [], [], []
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:440)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:837)
A simple Java program was used to fetch data from the table using the global index in a single session. This was executed in a loop while the exchange partition with global index maintenance was happening in a separate session. When we removed the update global indexes clause, and also when the fetches were done from different connections, this error didn't come. But in the same connection, just when the exchange completes, one select fails with this error; the rest of the selects work. So in a connection-pool environment, as many fetches as there are connections can fail.
Have you come across any such issue? Are we doing some thing wrong? We are using 11.2.0.3 version.
Or is this approach not recommended when the fetches use a global index?
I have a few more questions.
Ours is an OLTP application; the fetch will be like select * from T where (col1,col2,code) in ((?,?,?),(?,?,?),... like that).
Will partitioning impact the fetch performance?
If I use a local index, the exchange partition error is not there. If there can be a max of 700 partitions (codes), will a local index deteriorate the performance?
For the exchange partition, will the reads be affected because of the global index updates? (Query plan changes? Hard parses due to index changes?)
We are planning to proceed with the synonym option because of the ORA-600 error. Do you foresee any issues with the synonym approach?
You recommended the synonym approach for a different scenario of OLTP in the following thread.
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1238800184155 (Search for What about OLTP ?)
This case is different from us in that in this case a full refresh is happening, whereas ours is a selective refresh.
March 12, 2013 - 8:40 am UTC
please utilize support for ora-600's, that is obviously a support issue (the error message says as much).
Batch load and exchange partition
Lal, March 12, 2013 - 8:58 am UTC
Tom,
We will take up the 600 error with support. Can you please answer the following queries.
Ours is an OLTP application; the fetch will be like select * from T where (col1,col2,code) in ((?,?,?),(?,?,?),... like that).
Will partitioning impact the fetch performance?
If I use a local index, the exchange partition error is not there. If there can be a max of 700 partitions (codes), will a local index deteriorate the performance?
For the exchange partition, will the reads be affected because of the global index updates? (Query plan changes? Hard parses due to index changes?)
Do you foresee any issues with the synonym approach?
You recommended the synonym approach for a different scenario of OLTP in the following thread.
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1238800184155 (Search for What about OLTP ?)
This case is different from us in that in this case a full refresh is happening, whereas ours is a selective refresh.
March 12, 2013 - 9:35 am UTC
Will partitioning impact the fetch performance?
it will not make it faster; if you index improperly, it could make it much slower. So be careful in your indexing strategy (unless you are partitioning by col1, col2, code - use a global index on col1, col2, code to ensure you don't do MANY index range scans, one for each partition).
if you have 700 partitions - you could be doing 700 index range scans.
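The contrast could be sketched like this (index names are assumptions; the fetch predicate is the reader's `(col1, col2, code)` in-list):

```sql
-- A LOCAL index is partitioned like the table: a probe that can't be
-- pruned to one partition may range-scan every one of the 700 partitions.
create index t_local_ix on t (col1, col2) local;

-- A GLOBAL index spans all partitions: one range scan per probe,
-- at the cost of maintenance during partition exchanges.
create index t_global_ix on t (col1, col2, code);
```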
... For exchange partion Will the reads be affected because of the global index
updates? (Query plan changes? hard parses due to index changes) ...
you'll be doing more IO's - yes, your IO's might take longer, you'll be doing CPU stuff and IO stuff during this operation - that could affect performance - absolutely.
there could be some plan invalidation as well.
I don't see how synonyms would do anything for you.
Look - you are hitting a bug, it should not happen, please work with support to get a patch for that. You want your global indexes for queries, you want to use partitions for exchanging data in.
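Keeping the global indexes usable across the swap means asking the exchange to maintain them, which is where the extra time goes (a sketch; partition and table names are assumptions):

```sql
-- UPDATE GLOBAL INDEXES maintains the global indexes during the swap,
-- so they stay valid for readers - the observed minutes vs. seconds
-- is the cost of that index maintenance.
alter table t exchange partition p_a with table t_stage
  including indexes without validation
  update global indexes;
```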
Skip first record for each flat file
Charlie 木匠, February 02, 2015 - 11:15 pm UTC
Hi Tom,
When there are many data files for an external table, could we skip the first record (the header) of each file?
Thanks,
Charlie
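One commonly used workaround (a hedged sketch, not necessarily the official answer; table, directory, and file names are assumptions) is a LOAD WHEN clause, which is evaluated for every record of every location file, so a header line can be rejected by matching on a value only the header contains:

```sql
create table emp_ext (
  empno number(4),
  ename varchar2(10)
)
organization external
( type oracle_loader
  default directory my_dir
  access parameters
  ( records delimited by newline
    -- the header row carries the literal column name in the first field,
    -- so this condition filters it out of each file
    load when (empno != 'EMPNO')
    fields terminated by ','
  )
  location ('emp1.txt', 'emp2.txt')
);
```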