1 - get some genuine timings
============================
For example
FOR i IN 1..vn_loopctr LOOP
dbms_output.put_line(systimestamp);
vn_cnt := vn_cnt + 1;
V_SQL := 'INSERT INTO ' || COMB_TABLE_NAME || ' SELECT /*+ parallel (c,15) */ c.* FROM ' || VS_COPY_TABLE|| ' c WHERE c.comb_id between ' || VN_LAST || ' and ' || CHUNK_SIZE * VN_CNT ;
EXECUTE IMMEDIATE V_SQL;
COMMIT;
dbms_output.put_line(systimestamp);
v_sql := 'analyze table '||COMB_TABLE_NAME||' compute statistics';
EXECUTE IMMEDIATE V_SQL;
So you can get elapsed times for every component. Check out Logger for something more rigourous
https://github.com/OraOpenSource/Logger 2 - get some good trace data
============================
When you've located the part that is running slowly, add this:
dbms_monitor.session_trace_enable(waits=>true,binds=>true);
[then your slow code]
dbms_monitor.session_trace_disable;
3 - examine the tkprof formatted trace file
============================
From here you'll get executions plan, sql elapsed times etc and you'll be equipped to do some tuning
This will give you the best data to work with. Some things that look like potential options
a) parallel dml (your queries are parallel but your inserts are not)
b) direct load using the APPEND hint
c) dont use "analyze", its obsolete
d) use dbms_stat.set_table_stats using the cumulative sql%rowcounts to avoid a full dbms_stats run