RE: To Learn a new thing
Duke Ganote, January 25, 2012 - 3:11 pm UTC
The original query is nicely dense. However, a MODEL approach offers some advantages.
First, I "unwrapped" the original query into a modular (but not MODEL) approach:
WITH
sample_data AS (
  SELECT '123,354354,45353,83457439' AS csv
  FROM dual
),
data_plus AS (
  SELECT csv
       -- commas + 1 = number of elements in the list
       , LENGTH(csv) - LENGTH(REPLACE(csv, ',', '')) + 1 AS comma_cnt
  FROM sample_data
),
data_columnized AS (
  -- one row per element
  SELECT csv, LEVEL AS part#
  FROM data_plus
  CONNECT BY LEVEL <= comma_cnt
)
SELECT csv
     , SUBSTR(csv,
              -- start: position 1 for the first element, else one past the prior comma
              DECODE(part#, 1, 1, INSTR(csv || ',', ',', 1, part# - 1) + 1),
              -- length: from the start up to just before this element's comma
              (INSTR(csv || ',', ',', 1, part#) - 1)
              - DECODE(part#, 1, 1, INSTR(csv || ',', ',', 1, part# - 1) + 1) + 1
             ) AS value
FROM data_columnized
ORDER BY part#
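
Against the sample row, it should return one row per element, like so:

CSV                       VALUE
------------------------- --------
123,354354,45353,83457439 123
123,354354,45353,83457439 354354
123,354354,45353,83457439 45353
123,354354,45353,83457439 83457439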
That final SELECT is still pretty dense code to read.
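To trace just one row: for part# = 2, the start position is INSTR(csv || ',', ',', 1, 1) + 1 = 4 + 1 = 5, and the length is (INSTR(csv || ',', ',', 1, 2) - 1) - 5 + 1 = (11 - 1) - 5 + 1 = 6, so SUBSTR(csv, 5, 6) returns '354354'. Correct, but notice the same INSTR expression has to be written (and mentally evaluated) twice.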
By contrast, an approach using MODEL may be easier to debug:
(1) you can "see all the parts" without redundant code, since each intermediate value is computed once and named, and
(2) its step-by-step recursive style may be more familiar.
In the following query, I used MODEL for the final SELECT. I commented out the "helper" columns, but they're easily available for debugging:
WITH
sample_data AS (
  SELECT '123,354354,45353,83457439' AS csv
  FROM dual
),
data_plus AS (
  SELECT csv
       , LENGTH(csv) - LENGTH(REPLACE(csv, ',', '')) + 1 AS comma_cnt
  FROM sample_data
),
data_columnized AS (
  SELECT csv, LEVEL AS part#
  FROM data_plus
  CONNECT BY LEVEL <= comma_cnt
)
SELECT csv, part_csv --, part#, input_csv, output_csv
FROM data_columnized
MODEL
  DIMENSION BY (part#)
  MEASURES ( csv
           , CAST(NULL AS VARCHAR2(20)) AS part_csv     -- the element extracted at this step
           , CAST(NULL AS VARCHAR2(50)) AS input_csv    -- the string still to be parsed
           , CAST(NULL AS VARCHAR2(50)) AS output_csv   -- what remains after this step
           , CAST(NULL AS NUMBER)       AS decomma_pos# -- position of the next comma
           )
  RULES AUTOMATIC ORDER
  ( -- step 1 starts from the whole string (plus a trailing comma);
    -- every later step starts from the prior step's remainder
    input_csv[part#]    = CASE CV(part#)
                            WHEN 1 THEN csv[CV()] || ','
                            ELSE output_csv[CV() - 1]
                          END
  , decomma_pos#[part#] = INSTR(input_csv[CV()], ',')
  , part_csv[part#]     = SUBSTR(input_csv[CV()], 1, decomma_pos#[CV()] - 1)
  , output_csv[part#]   = SUBSTR(input_csv[CV()], decomma_pos#[CV()] + 1)
  )
ORDER BY part#
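
Un-comment those helper columns and the recursion becomes visible: each step peels off one element and hands the remainder to the next. The output should look something like this (CSV column omitted for width):

PART# INPUT_CSV                  PART_CSV  OUTPUT_CSV
----- -------------------------- --------- ----------------------
    1 123,354354,45353,83457439, 123       354354,45353,83457439,
    2 354354,45353,83457439,     354354    45353,83457439,
    3 45353,83457439,            45353     83457439,
    4 83457439,                  83457439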