Skip to Main Content
  • Questions
  • Extract a Node value from XML stored under BLOB in Oracle Tables has row 266 million rows

Breadcrumb

XMAS_TREE

The AskTOM team is taking a break over the holiday season, so we're not taking questions or responding to comments, but we might still sneak in a few Office Hours sessions! Thanks for being a member of the AskTOM community.

Question and Answer

Connor McDonald

Thanks for the question, Mani.

Asked: September 04, 2024 - 8:10 am UTC

Last updated: September 17, 2024 - 5:06 am UTC

Version: 19c

Viewed 10K+ times! This question is

You Asked

Hello Tom,

I have a table which is having around 266573283 rows. In that table i have BLOB column which stores a xml data. In that column i need to parse the xml and then get the value of on node. This node may or may not exists in all the rows. I have tried using python (cx_oracle package). The table has been partitioned on the date(Month and Year) the rows has been stored. I'm not able to process the xml. Is there a way or an approach to process the xml data

Note: This table will not grow anymore due to application shutdown.

2018 3543136
2019 3369956
2020 7576397
2021 82413536
2022 123216864
2023 46453394

Thanks & Regards,
Mani R

and Connor said...

The main issue here that the XML functions available to you are designed for XMLTYPE columns not BLOB.

But you can convert the data as you go, eg

SQL> create table t ( pk int, x blob );

Table created.

SQL>
SQL> begin
  2  insert into t values (1,utl_raw.cast_to_raw(
  3  '<employees>
  4    <employee>
  5      <empno>7369</empno>
  6      <ename>SMITH</ename>
  7      <job>CLERK</job>
  8      <hiredate>17-DEC-1980</hiredate>
  9    </employee>
 10  </employees>'
 11  )
 12  );
 13
 14  insert into t values (2,utl_raw.cast_to_raw(
 15  '<employees>
 16    <employee>
 17      <ename>SMITH</ename>
 18      <job>CLERK</job>
 19      <hiredate>17-DEC-1980</hiredate>
 20    </employee>
 21  </employees>'
 22  )
 23  );
 24
 25  end;
 26  /

PL/SQL procedure successfully completed.

SQL>
SQL> select extractvalue(x, '/employees/employee/empno')
  2  from t;
select extractvalue(x, '/employees/employee/empno')
       *
ERROR at line 1:
ORA-00932: inconsistent datatypes: expected - got -


SQL> select pk, extractvalue(xmltype(utl_raw.cast_to_varchar2(x)), '/employees/employee/empno') eno
  2  from t;

        PK ENO
---------- --------------------
         1 7369
         2



For that many rows, this is going to take a long time, but if it is just a one-off operation (ie, application shutdown) then you can dump the results to a table (create-table-as-select) so you have a permanent record.

If the data you want is not present in a large percentage of rows, then you might get some benefits by checking for the tag directly against the blob before converting to XMLTYPE, eg

SQL> select pk, extractvalue(xmltype(utl_raw.cast_to_varchar2(x)), '/employees/employee/empno') eno
  2  from t
  3  where dbms_lob.instr(x,utl_raw.cast_to_raw('empno')) > 0;

        PK ENO
---------- --------------------
         1 7369


Rating

  (2 ratings)

We're not taking comments currently, so please try again later if you want to add a comment.

Comments

Mani R, September 05, 2024 - 6:24 am UTC

Hello Connor,

Thanks for the solution I'll make use of the XMLTYPE conversion and create a new table for the ID and the XMLTYPE and extract the node value.
Connor McDonald
September 17, 2024 - 5:06 am UTC

Glad we could help

More to Explore

CLOB/BLOB/etc

Complete documentation on Securefiles and Large Objects here