Skip to Main Content
  • Questions
  • Extract a Node value from XML stored under BLOB in Oracle Tables has row 266 million rows

Breadcrumb

We're taking a break this week as Connor and Chris will both be at CloudWorld, the premier Oracle conference of 2024. If you're in Vegas, please come say Hi or pop into our sessions

Question and Answer

Connor McDonald

Thanks for the question, Mani.

Asked: September 04, 2024 - 8:10 am UTC

Last updated: September 05, 2024 - 4:23 am UTC

Version: 19c

Viewed 100+ times

You Asked

Hello Tom,

I have a table which is having around 266573283 rows. In that table i have BLOB column which stores a xml data. In that column i need to parse the xml and then get the value of on node. This node may or may not exists in all the rows. I have tried using python (cx_oracle package). The table has been partitioned on the date(Month and Year) the rows has been stored. I'm not able to process the xml. Is there a way or an approach to process the xml data

Note: This table will not grow anymore due to application shutdown.

2018 3543136
2019 3369956
2020 7576397
2021 82413536
2022 123216864
2023 46453394

Thanks & Regards,
Mani R

and Connor said...

The main issue here that the XML functions available to you are designed for XMLTYPE columns not BLOB.

But you can convert the data as you go, eg

SQL> create table t ( pk int, x blob );

Table created.

SQL>
SQL> begin
  2  insert into t values (1,utl_raw.cast_to_raw(
  3  '<employees>
  4    <employee>
  5      <empno>7369</empno>
  6      <ename>SMITH</ename>
  7      <job>CLERK</job>
  8      <hiredate>17-DEC-1980</hiredate>
  9    </employee>
 10  </employees>'
 11  )
 12  );
 13
 14  insert into t values (2,utl_raw.cast_to_raw(
 15  '<employees>
 16    <employee>
 17      <ename>SMITH</ename>
 18      <job>CLERK</job>
 19      <hiredate>17-DEC-1980</hiredate>
 20    </employee>
 21  </employees>'
 22  )
 23  );
 24
 25  end;
 26  /

PL/SQL procedure successfully completed.

SQL>
SQL> select extractvalue(x, '/employees/employee/empno')
  2  from t;
select extractvalue(x, '/employees/employee/empno')
       *
ERROR at line 1:
ORA-00932: inconsistent datatypes: expected - got -


SQL> select pk, extractvalue(xmltype(utl_raw.cast_to_varchar2(x)), '/employees/employee/empno') eno
  2  from t;

        PK ENO
---------- --------------------
         1 7369
         2



For that many rows, this is going to take a long time, but if it is just a one-off operation (ie, application shutdown) then you can dump the results to a table (create-table-as-select) so you have a permanent record.

If the data you want is not present in a large percentage of rows, then you might get some benefits by checking for the tag directly against the blob before converting to XMLTYPE, eg

SQL> select pk, extractvalue(xmltype(utl_raw.cast_to_varchar2(x)), '/employees/employee/empno') eno
  2  from t
  3  where dbms_lob.instr(x,utl_raw.cast_to_raw('empno')) > 0;

        PK ENO
---------- --------------------
         1 7369


Rating

  (2 ratings)

We're not taking comments currently, so please try again later if you want to add a comment.

Comments

Mani R, September 05, 2024 - 6:24 am UTC

Hello Connor,

Thanks for the solution I'll make use of the XMLTYPE conversion and create a new table for the ID and the XMLTYPE and extract the node value.

More to Explore

CLOB/BLOB/etc

Complete documentation on Securefiles and Large Objects here