"RE: So my general principle would be this. Check if things can be done WITHOUT REGEX, if yes skip regexp, if not then use REGEX"
I'd probably start with whatever approach is easiest first, then explore other options if performance is insufficient, or just for a check on the other.
For example, I tried both REGEX and normal SQL. Performance was adequate for data sets similar in size to that claimed for the original question. While REGEX performed a bit slower, it was so much simpler to code, I'd probably use it. I used the REGEX to debug my other approach, which was much more coding and debugging (and still not perfect).
CONVENTIONAL SQLWITH setup2 AS (
select view_name
, text_vc
, TRUNC((length(text_vc)-length(replace(text_vc,'"'))) / 2 ) as cnt
from dba_views_copy
where owner = 'SYS'
), rCTE2 ( view_name, text_vc, cnt, column_name, occurrence#, pp ) AS (
select view_name
, text_vc
, cnt
, ''
, 1
, instr(text_vc,'"',1) pp
FROM setup2
union all
select view_name
, text_vc
, cnt
, case when mod(occurrence#+1,2) = 0 and instr(text_vc,'"',1,occurrence#+1)-pp+1 <= 30
then substr(text_vc,pp,instr(text_vc,'"',1,occurrence#+1)-pp+1)
end as column_name
, occurrence#+1
, instr(text_vc,'"',1,occurrence#+1) pp
from rCTE2
where occurrence# <= cnt*2
) select /*+ gather_plan_statistics */ count(*), count(unique view_name)
, count(unique column_name)
from rCTE2 where column_name is not null;
COUNT(*) COUNT(UNIQUEVIEW_NAME) COUNT(UNIQUECOLUMN_NAME)
-------- ---------------------- ------------------------
43,003 3206 8753
Elapsed: 00:00:05.10
REGEX SQLWITH setup AS (
SELECT view_name, text_vc, regexp_count( text_vc, '"[^"]{1,30}"' ) cnt
FROM dba_views_copy
WHERE owner = 'SYS'
), rCTE ( view_name, text_vc, cnt, column_name, column_id ) AS (
SELECT view_name
, text_vc
, cnt
, regexp_substr( text_VC, '"[^"]{1,30}"', 1, 1) as column_name
, 1
FROM setup
UNION ALL
SELECT view_name
, text_vc
, cnt
, regexp_substr( text_VC, '"[^"]{1,30}"', 1, column_id + 1) as column_name
, column_id + 1
FROM rCTE
WHERE column_id <= cnt
) select /*+ gather_plan_statistics */ count(*)
, count(unique view_name), count(unique column_name)
from rCTE
where column_name is not null;
OUNT(*) COUNT(UNIQUEVIEW_NAME) COUNT(UNIQUECOLUMN_NAME)
------- ---------------------- ------------------------
43,232 3223 8872
Elapsed: 00:00:07.55