I wrote code that uses apex_data_parser to import data from an Excel file into the database. Everything is fine when I upload a small file.
But when I upload an Excel file of about 43 MB, both the upload and the step that inserts the data into the database take a very long time, and it finally fails with a meaningless error like "no data found in uploaded file". How can I solve this problem? I also tried the approach from this link:
https://stackoverflow.com/questions/60616754/oracle-apex-apex-data-parser-execution-time-explodes-for-big-files but it had no effect.
The following code reads the uploaded Excel file:
select line_number, col001,col002, col003, col004, col005, col006, col007, col008, col009, col010,
col011,col012, col013, col014, col015, col016, col017, col018, col019, col020,
col021,col022, col023, col024, col025, col026, col027, col028, col029, col030,
col031,col032, col033, col034, col035, col036, col037, col038, col039, col040,
col041,col042, col043, col044, col045, col046, col047, col048, col049, col050,
col051,col052, col053, col054, col055, col056, col057, col058, col059, col060
from apex_application_temp_files f, table( apex_data_parser.parse(
p_content => f.BLOB_CONTENT,
p_store_profile_to_collection => 'IMPORT_NEW',
p_file_name =>:P11_FILE) ) p
WHERE p.col001='1' or p.col001='2'
And here is a sample of the code used for the insert into the database:
Insert into E5ASSET
(CODE,NAME,PERSIANTITLE,UNITPRICE,ISACTIVE,ISPRODUCTION,ISTOOLS,T5TOOLS_ID,SERIALNUMBER,COP,EQPTYPE,E5PARENT_ID,R5DEPT_ID,E5CLASS_ID,E5TYPE_ID,E5CATEGORY_ID,E5PROFILE_ID,
E5HC_CRITICALITY_ID,E5HC_SHUTDOWNTYPE_ID,M5MFC_ID,E5EQUIPMENTSTATUS_ID,EQUIPMENTVALUE,MODEL,REVISION,E5HC_CURRENCY_ID,E5HC_UOM_ID,COMMISSIONDATE)
Select col003 CODE,col004 NAME,col005 PERSIANTITLE,col006 UNITPRICE,col007 ISACTIVE,col008 ISPRODUCTION,col009 ISTOOLS,q1.id T5TOOLS_ID,col012 SERIALNUMBER,col013 COP,
col014 EQPTYPE,q2.id E5PARENT_ID,q3.id R5DEPT_ID,q4.id E5CLASS_ID,q5.id E5TYPE_ID,q6.id E5CATEGORY_ID,q7.id E5PROFILE_ID,q8.id E5HC_CRITICALITY_ID,q9.id E5HC_SHUTDOWNTYPE_ID,
q10.id M5MFC_ID,q11.id E5EQUIPMENTSTATUS_ID,col035 EQUIPMENTVALUE,col036 MODEL,col037 REVISION,q12.id E5HC_CURRENCY_ID,q13.id E5HC_UOM_ID,col042 COMMISSIONDATE
from apex_application_temp_files f,
table ( apex_data_parser.parse (
p_content=> f.blob_content,
p_skip_rows=>207833,
p_max_rows=>207834,
p_file_name=> f.filename)) p
left join T5TOOLS q1 on q1.code=p.col011 and q1.r5Organization_id=123
left join E5EQUIPMENT q2 on nvl(q2.code,'*')=p.col016 and q2.r5Organization_id=123
left join R5DEPT q3 on q3.code=p.col018 and q3.r5Organization_id=123
left join E5CLASS q4 on q4.code=p.col020 and q4.r5Organization_id=123
left join E5TYPE q5 on q5.code=p.col022 and q5.r5Organization_id=123
left join E5CATEGORY q6 on q6.code=p.col024 and q6.r5Organization_id=123
left join E5PROFILE q7 on q7.code=p.col026 and q7.r5Organization_id=123
left join E5HC q8 on q8.code=p.col028 and q8.hc_id=6 and q8.r5Organization_id=123
left join E5HC q9 on q9.code=p.col030 and q9.hc_id=7 and q9.r5Organization_id=123
left join M5MFC q10 on q10.code=p.col032 and q10.r5Organization_id=123
left join E5EQUIPMENTSTATUS q11 on q11.code=p.col034 and q11.r5Organization_id=123
left join E5HC q12 on q12.code=p.col039 and q12.hc_id=1 and q12.r5Organization_id=123
left join E5HC q13 on q13.code=p.col041 and q13.hc_id=7 and q13.r5Organization_id=123
I'll suggest a different approach, because it sounds like you're dealing with some large Excel files here.
apex_data_parser must literally examine the BLOB contents to extract every single element, and it does so in PL/SQL.
An alternative is to use something external to the database to convert the file to CSV, and then read that via an external table.
Here's an example I did for a customer where we used Python to convert the Excel file to CSV and then read it via an external table.
excel2csv.py
============
import openpyxl
import csv
import sys

# expect exactly one argument: the Excel file to convert
n = len(sys.argv)
if n != 2:
    sys.exit("ERROR: Single file name must be provided")

filename = sys.argv[1]
# read_only keeps memory usage low for big workbooks
wb = openpyxl.load_workbook(filename, read_only=True)
sh = wb.active

# write every row of the active sheet to stdout as CSV
with sys.stdout as f:
    col = csv.writer(f)
    for row in sh.rows:
        col.writerow([cell.value for cell in row])
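If you want to sanity-check the CSV-writing loop without Excel or openpyxl installed, the same csv.writer logic works over any iterable of rows. A minimal self-contained sketch (the worksheet is replaced here by a plain list of tuples standing in for rows of cells):

```python
import csv
import io

# stand-in for sh.rows: each tuple plays the role of a row of cells
rows = [
    ("PK", "COL1", "COL8"),
    (1, "abc", 10),
    (2, "d,e", 20),  # embedded comma gets quoted by csv.writer
]

buf = io.StringIO()
writer = csv.writer(buf)
for row in rows:
    writer.writerow(list(row))

print(buf.getvalue())
```

Values with embedded delimiters come out quoted, so the external table's `fields terminated by ','` clause may need an `optionally enclosed by '"'` addition if your data contains commas.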
excel2csv.bat
=============
@echo off
set SystemRoot=C:\WINDOWS
C:\bin\Python37\python.exe C:\bin\Python37\excel2csv.py %1
external table
===========
SQL> create table my_excel_file
2 ( pk number(4)
3 ,col1 varchar2(10)
...
9 ,col8 number(2) )
10 organization external
11 ( type oracle_loader
12 default directory tmp
13 access parameters
14 ( records delimited by newline
15 preprocessor bin: 'excel2csv.bat'
16 fields terminated by ','
17 ( pk, col1, ...
18 ,col8 )
19 ) location('myfile.xlsx')
20 ) reject limit unlimited;
Table created.
Then just run your query as:
select * from my_excel_file
File names etc. can be altered dynamically at run time.
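For example (a sketch using the table and a placeholder file name from above; external tables accept an ALTER TABLE ... LOCATION change without a rebuild):

```sql
-- point the external table at a different uploaded file
alter table my_excel_file location ('otherfile.xlsx');

-- subsequent queries run the preprocessor against the new file
select * from my_excel_file;
```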