I'm creating an application which inserts a huge amount of data into the database. I'm using the native Oracle driver for Node.js:
https://github.com/oracle/node-oracledb
If I use statements like
"BEGIN insert into my_table values (1, 2, 3); insert into my_table values (4, 5, 6); ... END;"
without bind parameters, just constants as values, it takes a long time and causes extra shared memory usage (I guess because of the prepared statements).
If I use bind parameters, so the statement looks like this:
"BEGIN insert into my_table values (:p1, :p2, :p3); insert into my_table values (:p4, :p5, :p6); ... END;"
then the batch size is limited by the maximum number of bind parameters, so I have to DDoS the database with a huge number of requests.
Both methods described above create huge network traffic, which exceeds the amount of actual data by 2-3 times.
So I've found out that with Oracle there is one way to do it fast:
1) create CSV files
2) use SQL*Loader (SQLLDR) or a LOAD DATA statement.
I'm looking for something like what I have in MS SQL or MySQL:
1) In MS SQL there is a special method for bulk-inserting a huge amount of data:
https://www.npmjs.com/package/mssql#bulk
2) In MySQL, there is a special syntax for bulk-inserting data via an SQL statement:
https://dev.mysql.com/doc/refman/5.5/en/optimizing-innodb-bulk-data-loading.html
My question is:
What is the best way to load a huge amount of data into an Oracle database directly from the application, without CSV files?
Check out the latest addition to the driver:
https://github.com/oracle/node-oracledb/blob/master/doc/api.md#plsqlindexbybinds
You can bind to PL/SQL associative arrays and thus (for example) send batches of, say, 500 records at a time.
The performance differences are pretty astounding. For example:
Test 1:
=======
variable v1 number
variable v2 number
exec :v1 := 1; :v2 := 2;
set feedback off
select systimestamp from dual;
insert into T values (:v1,:v2);
insert into T values (:v1,:v2);
insert into T values (:v1,:v2);
... (50,000 in total)
select systimestamp from dual;
which yields
SQL> @c:\temp\ins.sql
PL/SQL procedure successfully completed.
SYSTIMESTAMP
-----------------------------------------------
11-FEB-16 02.12.37.827000 PM +08:00
SYSTIMESTAMP
-----------------------------------------------
11-FEB-16 02.12.41.290000 PM +08:00
So about 3.5 seconds.
Then, creating a script to do it in batches of 1,000 rows:
Test 2:
=======
select systimestamp from dual;
declare
  type list is table of number index by pls_integer;
  v1 list; v2 list;
begin
  for i in 1 .. 1000 loop
    v1(i) := 1; v2(i) := 2;
  end loop;
  forall i in 1 .. 1000
    insert into T values (v1(i),v2(i));
end;
/
(repeated 50 times)
select systimestamp from dual;
SQL> @c:\temp\ins2.sql
SYSTIMESTAMP
----------------------------------------
11-FEB-16 02.15.26.295000 PM +08:00
SYSTIMESTAMP
----------------------------------------
11-FEB-16 02.15.26.377000 PM +08:00
That's 50,000 rows in 0.08 seconds. Not too shabby :-)
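Wiring this back to node-oracledb, here's a rough sketch of the same batched approach from the application side. It's illustrative only: the wrapper procedure `bulk_ins`, the table columns, and the batch size of 500 are assumptions, and the database call needs node-oracledb plus a live connection (the chunking helper itself is plain JavaScript):

```javascript
// Hypothetical PL/SQL wrapper, created once on the database side:
//
//   create or replace procedure bulk_ins(p1 in dbms_sql.number_table,
//                                        p2 in dbms_sql.number_table) as
//   begin
//     forall i in 1 .. p1.count
//       insert into T values (p1(i), p2(i));
//   end;

// Split rows into batches of at most `size` elements.
function chunk(rows, size) {
  const batches = [];
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

// Send each batch as two PL/SQL associative-array binds.
async function bulkInsert(conn, rows) {
  const oracledb = require('oracledb'); // loaded here so chunk() stays standalone
  for (const batch of chunk(rows, 500)) {
    await conn.execute('begin bulk_ins(:p1, :p2); end;', {
      p1: { dir: oracledb.BIND_IN, type: oracledb.NUMBER,
            val: batch.map(r => r[0]) },
      p2: { dir: oracledb.BIND_IN, type: oracledb.NUMBER,
            val: batch.map(r => r[1]) },
    });
  }
  await conn.commit();
}
```

That's one round trip per 500 rows instead of one per row, which is where the traffic and timing wins come from.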