Hi,
I'm working with migrating data into a system using a migration tool supplied by the vendor of the target system.
The tool uses a predefined schema and we are getting the data in non normalized csv format. So we need to transform
the data into the migration schema.
So here I have a table
create table t ( ID int, Prop1 varchar2(30), Prop2 varchar2(30), Prop3 varchar2(30), Prop4 varchar2(30) );
with this data in it
insert into t values ( 12345, 'abc', 'abc', 'abc%20%def', 'def' );
I need to transform that into this: (The value in Prop3 is splited on '%20%)
ID PropName Value
12345 Prop1 abc
12345 Prop2 abc
12345 Prop3 abc
12345 Prop3 def
12345 Prop4 def
Today I use:
CURSOR cur_temp
IS
SELECT *
FROM t UNPIVOT (PROPVALUE FOR PROPNAME IN(
PROP1,PROP2,PROP3,PROP4));
Then I BULK COLLECT that into a table type
process it with PL/SQL using this nice query that I found on this site
(select substr (x,
instr (x, %20%, 1, level ) + 4,
instr (x, %20%, 1, level+1) - instr (x, %20%, 1, level) - 4)
as IBA_SINGLE_VAL
from (select %20%||l_org_att_wtpart(indx).STRINGVALUE||%20% as x from dual)
connect by instr (x, %20%, 1, level+1) > 0)
Then I use FORALL to insert the values.
The current solution works, but has not been tested on larger data sets. In an up coming migration we have a data
set where the table t will have around 2 miljon records and 20 prop columns so I a little concerned about performance.
I know your clear recommendations is to first try to use a single SQL statement and if that fails use PL/SQL.
Is this possible to do in SQL?
Thanks
Anders
Sure it's possible in one statement. All you need to do is pass the unpivoted data to an adapted csv-to-rows solution. Look for %20% instead of commas.
Here's an example using regular expressions:
create table t (
ID int, Prop1 varchar2(30), Prop2 varchar2(30),
Prop3 varchar2(30), Prop4 varchar2(30)
);
insert into t values ( 12345, 'abc', 'abc', 'abc%20%def', 'def' );
with rws as (
select level x from dual
connect by level <= 10 -- ensure >= number of %20% occurrences
), unpiv as (
select *
from t unpivot (
propvalue for propname in(prop1,prop2,prop3,prop4)
)
)
select regexp_count(propvalue, '%20%'),
regexp_substr(propvalue, '[^(%20%)]+', 1, x) val,
unpiv.*
from unpiv
join rws
on regexp_count(propvalue, '%20%') + 1 >= x
order by id, propname, val;
REGEXP_COUNT(PROPVALUE,'%20%') VAL ID PROPNAME PROPVALUE
0 abc 12345 PROP1 abc
0 abc 12345 PROP2 abc
1 abc 12345 PROP3 abc%20%def
1 def 12345 PROP3 abc%20%def
0 def 12345 PROP4 def
You can find many other csv-to-rows examples on this site.