Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, John.

Asked: April 29, 2008 - 3:10 pm UTC

Last updated: April 30, 2008 - 1:03 pm UTC

Version: 10.2.0

Viewed 1000+ times

You Asked

I'm having a problem with External tables. Most of the time I can read the data in the table without any issue. But when I issue the query below I get an error.


Contents of test.csv file
Tool type,ID
Common,1
Type1,2
Type2,3



  CREATE TABLE "TEST_IMPORT" 
     ( "MACHINE" VARCHAR2(20 BYTE), 
   "ID" NUMBER
     ) 
     ORGANIZATION EXTERNAL 
      ( TYPE ORACLE_LOADER
        DEFAULT DIRECTORY "IMPORT_DIR"
        ACCESS PARAMETERS
        ( records delimited by newline  skip 1
          fields terminated by ','
          optionally enclosed by '"'                                             )
        LOCATION
         ( "IMPORT_DIR":'test.csv'
         )
      )
    ;
 
 
 
 
SQL> select count(*) from test_import;
 
 COUNT(*)               
 ---------------------- 
 3                      
 
 1 rows selected

SQL> create table perm_tbl as select * from test_import where upper(machine) in
('COMMON','TYPE1');

Table created.
 
SQL> desc perm_tbl;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------

 MACHINE                                            VARCHAR2(20)
 ID                                                 NUMBER



select * from (
select id from perm_tbl where upper(machine) in ('COMMON','TYPE1')
);

ID           
---------------------- 
1                      
2                      

2 rows selected


SQL> select * from perm_tbl where id not in
  2  (select id from test_import where upper(machine) in ('COMMON','TYPE2'));
select * from perm_tbl where id not in
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-30653: reject limit reached
ORA-06512: at "SYS.ORACLE_LOADER", line 52


Log file:
  LOG file opened at 04/29/08 13:08:03

Field Definitions for table TEST_IMPORT
  Record format DELIMITED BY NEWLINE
  Data in file has same endianness as the platform
  Rows with all null fields are accepted

  Fields in Data Source: 

    MACHINE                         CHAR (255)
      Terminated by ","
      Enclosed by """ and """
      Trim whitespace same as SQL Loader
    ID                    CHAR (255)
      Terminated by ","
      Enclosed by """ and """
      Trim whitespace same as SQL Loader
error processing column ID in row 1 for datafile C:\test.csv
ORA-01722: invalid number

BAD file:
Tool type,ID


and Tom said...

you are assuming an order of operation in SQL processing and you cannot, we can evaluate anything in any order we want.


SQL> create table perm_tbl as select * from test_import where upper(machine) in
('COMMON','TYPE1');


we apparently evaluated "where uppper() in (...)" before selecting machine and id - and hence converting the string into a number for ID - and therefore did NOT fail.


for:

SQL> select * from perm_tbl where id not in
2 (select id from test_import where upper(machine) in ('COMMON','TYPE2'));


well, one way to process that query is:

select p.*
from perm_tbl p, test_import t
where p.id = t.id(+)
and t.machine in ('COMMON','TYPE2')
and t.id is null;


(an anti-join), there are other ways as well - but as you can see, the SQL you "get" is not necessary the sql you wrote - they are equivalent, but different.

It is all about the plan... And the accidental order of operation.

Here is a potential plan for your query:

ops$tkyte%ORA10GR2> select * from perm_tbl where id not in
  2  (select id from test_import where upper(machine) in ('COMMON','TYPE2'));

-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    25 |     4   (0)| 00:00:01 |
|*  1 |  FILTER                     |             |       |       |            |          |
|   2 |   TABLE ACCESS FULL         | PERM_TBL    |     1 |    25 |     2   (0)| 00:00:01 |
|*  3 |   EXTERNAL TABLE ACCESS FULL| TEST_IMPORT |     2 |    50 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "TEST_IMPORT" "TEST_IMPORT" WHERE
              (UPPER("MACHINE")='COMMON' OR UPPER("MACHINE")='TYPE2') AND LNNVL("ID"<>:B1)))
   3 - filter((UPPER("MACHINE")='COMMON' OR UPPER("MACHINE")='TYPE2') AND
              LNNVL("ID"<>:B1))


Note
-----
   - dynamic sampling used for this statement



see the filter on step 3? that is why...

if you know that when machine in (values) means ID is a number, you would:

case when upper(machine) in ('A','B') then to_number(id) end

and map ID as a STRING - which is apparently is...




Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

John Lavoie, April 30, 2008 - 12:41 pm UTC

No. Looking at the BAD FILE, the line in the CSV file that Oracle is choking on is the HEADER line.

If we remove that from the CSV file and remove the SKIP 1 from the ACCESS PARAMETERS, then the query works correctly.


Contents of test.csv
Common,1
Type1,2
Type2,3

External table definition
  CREATE TABLE "TEST_IMPORT" 
   ( "MACHINE" VARCHAR2(20 BYTE), 
 "ID" NUMBER
   ) 
   ORGANIZATION EXTERNAL 
    ( TYPE ORACLE_LOADER
      DEFAULT DIRECTORY "IMPORT_DIR"
      ACCESS PARAMETERS
      ( records delimited by newline
          fields terminated by ','
          optionally enclosed by '"' 
    )
      LOCATION
       ( "IMPORT_DIR":'test.csv'
       )
    )
  ;

Same query as before
SQL> select * from perm_tbl where id not in
  (select id from test_import where upper(machine) in ('COMMON','TYPE2'))


MACHINE              ID                     
-------------------- ---------------------- 
Type1                2                      

1 rows selected

Tom Kyte
April 30, 2008 - 1:03 pm UTC

ahh, I see - sorry about that.

That will be a 'bug' apparently - it does not reproduce in 11gR1 but does in 9i and 10g.

please contact support and reference bug 4380703

clarification on the query

Amit Poddar, April 30, 2008 - 7:48 pm UTC

Hi,

In the case of above query, I have updated perm_table id to be null for both the rows then I run the same query

SQL> update perm_tbl set id=null;

2 rows updated.

SQL> commit;

Commit complete.

SQL> alter session set statistics_level=all;

Session altered.

SQL> select * from perm_tbl where id not in
   (select id from test_import where upper(machine) in ('COMMON','TYPE2'))
  2    3  
SQL> /

no rows selected

SQL> set linesize 150;
SQL> set pagesize 4000;
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  58fff37ugs9t7, child number 2
-------------------------------------
select * from perm_tbl where id not in    (select id from test_import where upper(machine)
in ('COMMON','TYPE2'))

Plan hash value: 3425822930

-----------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------
|*  1 |  FILTER                     |             |      1 |        |      0 |00:00:00.06 |     118 |
|   2 |   TABLE ACCESS FULL         | PERM_TBL    |      1 |      2 |      2 |00:00:00.01 |       3 |
|*  3 |   EXTERNAL TABLE ACCESS FULL| TEST_IMPORT |      1 |      2 |      1 |00:00:00.06 |     115 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( IS NULL)
   3 - filter(((UPPER("MACHINE")='COMMON' OR UPPER("MACHINE")='TYPE2') AND LNNVL("ID"<>:B1)))



1. Can you please explain the filter 1 i.e. filter (IS NULL)
2. Why is the external table in step 3 starts only once instead of twice since the number of rows returned in step 2is 2.
3. Since at lease one id is null in perm_table, the result of not in will be no rows selected, so oracle can get away without accessing the external table at all (it looks like that this optimization is not implemented, is that correct)

More to Explore

Hints

Need more information on hints? Check out the complete hint descriptions