Copy millions of BLOBs from one table to another

Connor McDonald

Thanks for the question, Heinrich.

Asked: October 07, 2016 - 9:01 am UTC

Last updated: October 10, 2016 - 5:32 am UTC

Version: 11.2.0.4.0


You Asked

Hello colleagues,

I've got a "small" problem: I have to copy millions of BLOBs from one table to another.

Target table:


PAGE
----------------------
ID NOT NULL NUMBER(19)
DOCUPLOADCODE VARCHAR2(255)
MIMETYPE NOT NULL VARCHAR2(255)
THUMBNAILMIMETYPE VARCHAR2(255)
VERSION NUMBER(19)
SCANPAGEMETADATA_ID NUMBER(19)
PAGEFILE BLOB
THUMBNAIL BLOB
PAGEFILE_CDBREF NUMBER
THUMBNAIL_CDBREF NUMBER


The columns PAGEFILE_CDBREF and THUMBNAIL_CDBREF are references to the source table, and the columns PAGEFILE and THUMBNAIL contain the BLOBs.

Source table looks like this:


ODMM_CONTENTSTORE
----------------------
ID NOT NULL NUMBER(20)
LOCKEDFORSESSION NUMBER(20)
MEDIAID NUMBER(20)
LASTACCESSDATE NUMBER(20)
CONTENTTYPE NUMBER(10)
HSMLOCATION VARCHAR2(2000)
BFILECONTENT BINARY FILE LOB
GLOBALINDEXEDBLOB BLOB
EMAILINDEXEDBLOB BLOB
EMAILINDEXEDBLOB_T BLOB
INTERMEDIABLOB BLOB
INTERMEDIABLOB_T BLOB


ID is the reference target for PAGEFILE_CDBREF and THUMBNAIL_CDBREF, and INTERMEDIABLOB holds the data for PAGEFILE and THUMBNAIL.

The update looks like this:

update 
 page dst
set 
 thumbnail = (select intermediablob from odmm_contentstore src1 where src1.id = dst.thumbnail_cdbref)
 , pagefile = (select intermediablob from odmm_contentstore src2 where src2.id = dst.pagefile_cdbref);


The table PAGE contains 17 million rows, and all of them have to be updated.

Any ideas how I can accelerate this?

Kind regards,
Heinrich

and Connor said...

Options

1) Often a create-table-as-select is (much) faster than an update, so explore and benchmark mapping your update to a CTAS instead
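A minimal sketch of that mapping, assuming every row gets a new LOB value; the table name PAGE_NEW, the parallel degree, and the securefile/nologging storage options are illustrative assumptions to benchmark, not settings taken from the question:

create table page_new
  nologging                        -- assumes the recoverability trade-off is acceptable
  lob (pagefile, thumbnail) store as securefile (nocache nologging)
  parallel 8                       -- illustrative degree
as
select p.id,
       p.docuploadcode,
       p.mimetype,
       p.thumbnailmimetype,
       p.version,
       p.scanpagemetadata_id,
       src2.intermediablob as pagefile,
       src1.intermediablob as thumbnail,
       p.pagefile_cdbref,
       p.thumbnail_cdbref
from   page p
       left outer join odmm_contentstore src1 on src1.id = p.thumbnail_cdbref
       left outer join odmm_contentstore src2 on src2.id = p.pagefile_cdbref;

The outer joins keep the same semantics as the correlated subqueries (rows with no match get NULL LOBs). Afterwards you would drop or rename the old table and recreate the indexes, constraints, and grants on the new one; with NOLOGGING, take a backup once the load is done, since it is not recoverable from redo.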

2) If it has to stay as an update, check out dbms_parallel_execute to do the task in concurrent chunks
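A sketch of that approach, reusing the original update with a rowid-range predicate; the task name, chunk size, and parallel level are illustrative and worth benchmarking:

declare
  l_task varchar2(30) := 'copy_blobs';  -- illustrative task name
begin
  dbms_parallel_execute.create_task(task_name => l_task);

  -- carve PAGE into rowid ranges of roughly 10,000 rows each
  dbms_parallel_execute.create_chunks_by_rowid(
    task_name   => l_task,
    table_owner => user,
    table_name  => 'PAGE',
    by_row      => true,
    chunk_size  => 10000);

  -- run the original update once per chunk; dbms_parallel_execute binds
  -- :start_id and :end_id to each chunk's rowid range
  dbms_parallel_execute.run_task(
    task_name      => l_task,
    sql_stmt       => 'update page dst
                          set thumbnail = (select intermediablob from odmm_contentstore src1
                                           where src1.id = dst.thumbnail_cdbref)
                            , pagefile  = (select intermediablob from odmm_contentstore src2
                                           where src2.id = dst.pagefile_cdbref)
                        where dst.rowid between :start_id and :end_id',
    language_flag  => dbms_sql.native,
    parallel_level => 8);  -- illustrative number of concurrent jobs

  dbms_parallel_execute.drop_task(task_name => l_task);
end;
/

Each chunk commits independently, so a failed run can be resumed with dbms_parallel_execute.resume_task rather than restarted from scratch.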

Hope this helps.


Comments

LOBs are difficult

Jonathan Lewis, October 09, 2016 - 2:16 pm UTC

The best strategy may depend hugely on whether the target LOBs are basicfile or securefile, whether there are existing (non-empty) LOB values in the target table, whether the LOBs are cache or nocache, and whether they are logging or nologging.
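A quick way to check those attributes for the target table (a minimal sketch; SECUREFILE reads NO for basicfile LOBs):

select column_name, securefile, cache, logging, chunk
from   user_lobs
where  table_name = 'PAGE';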

There are several ways in which LOB handling will cause serialization and "single-row processing" problems - and you have to know which bottlenecks you need to work around.

The dbms_parallel_execute (or equivalent) strategy is almost inevitably going to be necessary for the best performance.



Connor McDonald
October 10, 2016 - 5:32 am UTC

Thanks for stopping by Jonathan.

Since you didn't :-), I'll add a link to your recent basicfile LOB series for the original poster:

https://jonathanlewis.wordpress.com/2016/08/05/basicfile-lobs-1/

