Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Fahd.

Asked: November 30, 2014 - 9:11 am UTC

Last updated: November 30, 2014 - 10:46 pm UTC

Version: 11.2.0.1.0

Viewed 1000+ times

You Asked

Hi Tom,

You say to write good SQL, think in SETS and Algorithm at the same time. Please explain how can we think in SETS ? what is the difference in thinking in SETS and not thinking in SETS ?

Thanks

and Tom said...

Not thinking in sets is like thinking in slow by slow, row by row ways. If you hear yourself saying/thinking "for each record do this, then do that, then do the other thing". You should be thinking - we need to do this to everything, then that, then the other thing.


You could load a warehouse by reading a record, verifying the foreign keys are all good in that record, validating each field, and then inserting.

What you should do is

load everything by running a query using outer joins to verify foreign keys and validating each field


for example, you have an external table mapped to a gzipped compressed file. Your job is to load said file, verify data integrity, maybe even transform some data.

You could:

for x in (select * from external_table)
loop
   check this
   check that
   check the other thing
   if (all_ok)
      insert record
   else 
      log bad record
   end if
end loop



that would be slow by slow. It would take linearly longer for every extra record you add.

Think about doing it this way: Instead of doing something for each record, you are doing something to all records.

insert /*+ append */ all
when (flag1+flag2+flag3+flag4+flag5 = 0) then insert production_table (all of the et columns)
else then insert into error_logging_table (all of the et columns and the flags)
select et.*, 
       case when lookup1.key is null then 1 else 0 end flag1,
       case when lookup2.key is null then 1 else 0 end flag2,
       case when lookup2.key is null then 1 else 0 end flag3,
       count(*) over (partition by unique_column) flag4,
       case when (a+b+c > 500) then 1 else 0 end flag5,
       .......
  from et, lookup1, lookup2, lookup3
 where .....


here we decompress, verify/validate/transform if you want in the select, load and compress (direct path load, we can hybrid columnar compress on engineered systems) all in one step.

If you are using the word "for" a lot when thinking about your large processes, you are thinking procedurally. get out of that habit.


Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Important but I'd elaborate a bit...

A reader, November 30, 2014 - 9:01 pm UTC

"Think in SETS" is critical and it's great you repeat it and explain it like this.

The only thing I would suggest you build out the example with real tables and listings. Also, you might want to deconstruct the implicit cursor loop into the more verbose but more concrete declaration of a cursor.
Tom Kyte
November 30, 2014 - 10:46 pm UTC

I've done that so many times... On this site... In my presentations... during seminars... over and over...

google up:

site:asktom.oracle.com "slow by slow"



any everything is a cursor - an update takes a cursor. same with delete. select also.


Many people confuse the term "cursor" with something procedural. Procedural is procedural (row by row, slow by slow). You need a cursor to do that. You also need a cursor to do what I did above.

It is your algorithm that counts. Is it slow by slow, driven by logic that works with hundreds/thousands of records? Or is it built for speeds up to Mach 5?

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.