Problem uploading large excel files to apex_data_parser

Connor McDonald

Thanks for the question, nasim.

Asked: November 15, 2023 - 8:36 am UTC

Last updated: December 20, 2023 - 8:00 am UTC

Version: Oracle 21c, APEX 20.1

Viewed 1000+ times

You Asked

I wrote code that uses apex_data_parser to import data from an Excel file into the database. Everything is fine as long as I upload a small file.
But when I upload an Excel file of about 43 MB, both the upload itself and the step that inserts the data into the database take a very long time, and in the end it fails with an unhelpful error like "no data found in uploaded file". How can I solve this problem? I also tried the approach from this link:

https://stackoverflow.com/questions/60616754/oracle-apex-apex-data-parser-execution-time-explodes-for-big-files

but it had no effect.

The following code is used to upload the Excel file:

select line_number, col001,col002, col003, col004, col005, col006, col007, col008, col009, col010,
                    col011,col012, col013, col014, col015, col016, col017, col018, col019, col020,
                    col021,col022, col023, col024, col025, col026, col027, col028, col029, col030,
                    col031,col032, col033, col034, col035, col036, col037, col038, col039, col040,
                    col041,col042, col043, col044, col045, col046, col047, col048, col049, col050,
                    col051,col052, col053, col054, col055, col056, col057, col058, col059, col060
from apex_application_temp_files f, table( apex_data_parser.parse(
                     p_content                     => f.BLOB_CONTENT,
                     p_store_profile_to_collection => 'IMPORT_NEW',
                     p_file_name                   =>:P11_FILE) ) p
WHERE p.col001='1' or p.col001='2'


and here is a sample of the code that is used to insert the parsed data into the database:

Insert into E5ASSET
 (CODE,NAME,PERSIANTITLE,UNITPRICE,ISACTIVE,ISPRODUCTION,ISTOOLS,T5TOOLS_ID,SERIALNUMBER,COP,EQPTYPE,E5PARENT_ID,R5DEPT_ID,E5CLASS_ID,E5TYPE_ID,E5CATEGORY_ID,E5PROFILE_ID,
 E5HC_CRITICALITY_ID,E5HC_SHUTDOWNTYPE_ID,M5MFC_ID,E5EQUIPMENTSTATUS_ID,EQUIPMENTVALUE,MODEL,REVISION,E5HC_CURRENCY_ID,E5HC_UOM_ID,COMMISSIONDATE)
 
 Select col003 CODE,col004 NAME,col005 PERSIANTITLE,col006 UNITPRICE,col007 ISACTIVE,col008 ISPRODUCTION,col009 ISTOOLS,q1.id T5TOOLS_ID,col012 SERIALNUMBER,col013 COP,
 col014 EQPTYPE,q2.id E5PARENT_ID,q3.id R5DEPT_ID,q4.id E5CLASS_ID,q5.id E5TYPE_ID,q6.id E5CATEGORY_ID,q7.id E5PROFILE_ID,q8.id E5HC_CRITICALITY_ID,q9.id E5HC_SHUTDOWNTYPE_ID,
 q10.id M5MFC_ID,q11.id E5EQUIPMENTSTATUS_ID,col035 EQUIPMENTVALUE,col036 MODEL,col037 REVISION,q12.id E5HC_CURRENCY_ID,q13.id E5HC_UOM_ID,col042 COMMISSIONDATE 
 
 from apex_application_temp_files f,
                     table ( apex_data_parser.parse (  
                            p_content=> f.blob_content,
                            p_skip_rows=>207833,
                            p_max_rows=>207834,
                            p_file_name=> f.filename)) p
                            left join T5TOOLS q1 on  q1.code=p.col011 and  q1.r5Organization_id=123
                            left join E5EQUIPMENT q2 on  nvl(q2.code,'*')=p.col016 and  q2.r5Organization_id=123
                            left join R5DEPT q3 on  q3.code=p.col018 and  q3.r5Organization_id=123
                            left join E5CLASS q4 on  q4.code=p.col020 and  q4.r5Organization_id=123
                            left join E5TYPE q5 on  q5.code=p.col022 and  q5.r5Organization_id=123
                            left join E5CATEGORY q6 on  q6.code=p.col024 and  q6.r5Organization_id=123
                            left join E5PROFILE q7 on  q7.code=p.col026 and  q7.r5Organization_id=123 
                            left join E5HC q8 on  q8.code=p.col028 and  q8.hc_id=6 and  q8.r5Organization_id=123
                            left join E5HC q9 on  q9.code=p.col030 and  q9.hc_id=7 and  q9.r5Organization_id=123
                            left join M5MFC q10 on  q10.code=p.col032 and  q10.r5Organization_id=123
                            left join E5EQUIPMENTSTATUS q11 on  q11.code=p.col034 and  q11.r5Organization_id=123
                            left join E5HC q12 on  q12.code=p.col039 and  q12.hc_id=1 and  q12.r5Organization_id=123
                            left join E5HC q13 on  q13.code=p.col041 and  q13.hc_id=7 and  q13.r5Organization_id=123



and Connor said...

I'll suggest a different approach because it sounds like you're dealing with some large Excel files here.

"apex_data_parser" must literally examine the blob contents to extract out every single element, its being done with PLSQL.

An alternative might be to use something external to the database to convert the file to a CSV and then read that via an external table.

Here's an example I did for a customer where we used Python to convert the Excel file to CSV and then read it via an external table.

excel2csv.py
============
import openpyxl
import csv
import sys

# expect exactly one argument: the Excel file to convert
n = len(sys.argv)
if n != 2:
  sys.exit("ERROR: Single file name must be provided")

filename = sys.argv[1]

# open the workbook and take the active sheet
wb = openpyxl.load_workbook(filename)
sh = wb.active

# write every row of the sheet to stdout as CSV,
# which is what the external table preprocessor will capture
with sys.stdout as f:
    writer = csv.writer(f)
    for row in sh.rows:
        writer.writerow([cell.value for cell in row])
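
To sanity-check the conversion outside the database, you can run the script directly and redirect its output to a file; for example (the file names here are purely illustrative):

C:\bin\Python37\python.exe excel2csv.py myfile.xlsx > myfile.csv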

excel2csv.bat
=============
@echo off
rem the preprocessor runs with a minimal environment, so Python on Windows needs SystemRoot set
set SystemRoot=C:\WINDOWS
rem convert the Excel file passed as %1 to CSV on stdout
C:\bin\Python37\python.exe C:\bin\Python37\excel2csv.py %1


external table
==============
SQL> create table my_excel_file 
  2      ( pk      number(4)
  3        ,col1      varchar2(10)
           ...

  9        ,col8 number(2)   )
 10      organization external
 11      ( type oracle_loader
 12        default directory tmp
 13        access parameters
 14        ( records delimited by newline
 15          preprocessor bin: 'excel2csv.bat'
 16          fields terminated by ','
 17         ( pk, col1, ...
 18          ,col8 )
 19       ) location('myfile.xlsx')
 20     ) reject limit unlimited;

Table created.
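
Note that this relies on two directory objects (and their grants) being in place: the default directory that holds the data file, and the directory that holds the preprocessor script. A minimal sketch, assuming the file lands in C:\temp, excel2csv.bat sits in C:\bin, and MY_APP_USER is a placeholder for whichever schema owns the external table:

-- directory holding the uploaded Excel file (and log/bad files)
create directory tmp as 'C:\temp';
-- directory holding excel2csv.bat; the preprocessor needs EXECUTE here
create directory bin as 'C:\bin';

grant read, write on directory tmp to my_app_user;
grant execute on directory bin to my_app_user;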


then just run your query as

select * from my_excel_file

File names etc. can be dynamically altered at run time.
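
For example, you can re-point the external table at a different file before querying it. A rough sketch (the file name is illustrative):

alter table my_excel_file location ('another_upload.xlsx');

select * from my_excel_file;

And because in your case the file arrives through an APEX file-browse item, you would first need to get the uploaded BLOB out of apex_application_temp_files and onto the database server's file system so the preprocessor can see it. A rough sketch, assuming the TMP directory object above and that :P11_FILE holds the NAME of the uploaded file:

declare
  l_blob    blob;
  l_file    utl_file.file_type;
  l_buffer  raw(32767);
  l_amount  pls_integer;
  l_pos     integer := 1;
  l_len     integer;
begin
  -- fetch the uploaded file from the APEX temporary table
  select blob_content
  into   l_blob
  from   apex_application_temp_files
  where  name = :P11_FILE;

  l_len  := dbms_lob.getlength(l_blob);
  l_file := utl_file.fopen('TMP', 'myfile.xlsx', 'wb', 32767);

  -- copy the BLOB to the server file system in 32k chunks
  while l_pos <= l_len loop
    l_amount := 32767;
    dbms_lob.read(l_blob, l_amount, l_pos, l_buffer);
    utl_file.put_raw(l_file, l_buffer, true);
    l_pos := l_pos + l_amount;
  end loop;

  utl_file.fclose(l_file);
end;
/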
