Skip to Main Content
  • Questions
  • Using SELECT * combined with WITH-CLAUSE - Bad Practice? View gets compiled with static columns list

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Samuel.

Asked: February 20, 2018 - 10:58 am UTC

Last updated: February 22, 2018 - 4:52 pm UTC

Version: 12c+

Viewed 10K+ times! This question is

You Asked

Hey guys,

I have a question regarding clean SQL Code / Bad Practice around the use of wildcards in SELECT-Statements.
In the provided example I have a base-query with a huge list of columns selected and two (or more) sources I need to have combined by UNION ALL.
For I am a lazy developer and also try to follow the DRY principle (Don't repeat yourself) I thought it would be a great idea to encapsulate information I need in both selects in a WITH-Clause and referencing it by wildcard. This also greatly enhances the readability of the SELECT-Statement in my opinion. I even think that providing the concrete list of columns 3 times would increase the likelyness of bugs.

Question #1: Would you see that approach as bad practice or actual as solid approach towards more readable and clean SQL statements?

I now want to save my query as view so it's easily accessible (and reusable). When I do so I noticed that Oracle will transalte my wildcard-query into a static list on compilation, which leads to the situation that I suddenly have a difference between my Script-Files (which are version-controlled) and my actual database. This might lead to problems when doing code-coverage or other profiling/comparation stuff.

Question #2: What's the reason Oracle changes the original implementation of the view internally?

Question #3: Would you consider differing sourcefiles/database-sources as minor issue when we can achieve more readable and error-prone code in exchange? What's your opinion about the general topic?

Looking forward to your opinions!

Cheers,
Sam

with LiveSQL Test Case:

and Chris said...

Before answering your questions, let's review:

Why is select * bad?

- You're (probably) selecting more columns than you need to. This means transferring more data over the network and less likelihood of using index-only scans or other sub-optimal execution plans.

- If your code uses select * to return data into a fixed list of variables, adding or removing any columns will cause this to break. So every time you change a table's columns you need to update these statements. If you don't need all the columns to start with or new columns you add, this increases your work for no benefit.

Now in your query:

with info as ( 
  select id description_id, 
         prename,  
         name, 
         some, 
         other, 
         information 
  from   person_description
) 
select j.id, j.title, i.* 
from   jedi j 
inner join info i 
on    j.fk_description = i.description_id 
union all 
select s.id, s.title, i.* 
from   sith s 
inner join info i 
on    s.fk_description = i.description_id;


Presumably, you've selected exactly what you want from person_description. And, when you change this, you're editing the query anyway so can review if select * is still what you need. So yes, you could argue that in this case, it's OK. Make sure you're clear about the reasons why you do this!

A1. Everyone has their own pet preferences on what makes SQL "more readable". So all I'll say is:

Be consistent.

Make sure you stick with a formatting style throughout your application.

A2. When you create a view, the database needs to update the data dictionary to store information about it. This includes the columns, their data type and so on.

If you leave a view as select * then when you run many alter table commands you have to rebuild the view too. If this is a widely-used core view, this could trigger a rebuild lots of the data dictionary! By making the columns explicit you bound the limits of what's directly affected.

A3. Any difference between source code and database code becomes something you need to check. And remember that yes, in this case, the fact they don't match is acceptable. The more of these exceptions you create, the more thoroughly you need to document your decisions. And the harder it is to spot genuine differences. It also makes it tougher for new developers to join the project.

And, if for some reason the tables in your dev/test/prod databases have different columns, with select * your view will compile fine. But if a named column is missing you'll get an exception immediately. If this breaks a release, it makes root cause analysis easier!

So in my opinion, you need to have significant benefits to using select * instead of named columns in your views. Personally I think the benefit here is small at best. But see A1. ;)

Rating

  (5 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

column order

Racer I., February 21, 2018 - 7:46 am UTC

Hi,

Yet another point :

> And, if for some reason the tables in your dev/test/prod databases have different columns

They can also be the same columns in a different order.

Say in production a table was changed with add column + drop column so the new column is in last position but a freshly set up test db may use the official create table statement which may have that new column elsewhere (say at the position of the old column).

We also occasionally reorder columns to make the ones most likely to be (and remain) null last (since nulls at the end of a row don't take up storage).

regards,
Chris Saxon
February 21, 2018 - 11:04 am UTC

Great point, hadn't thought of that. Releases are often applied in a different order on prod than dev/test too, giving different orders.

You can also break other objects...

John Keymer, February 21, 2018 - 9:46 am UTC

SQL> set echo on
SQL> @a;
SQL> create table xx(a number);

Table created.

SQL> create materialized view xx_mv as select * from xx;

Materialized view created.

SQL> alter table xx add (b number);

Table altered.

SQL> exec dbms_mview.refresh(list=>'xx_mv',method=>'C');
BEGIN dbms_mview.refresh(list=>'xx_mv',method=>'C'); END;

*
ERROR at line 1:
ORA-12008: error in materialized view refresh path
ORA-00913: too many values
ORA-00904: "XX"."B": invalid identifier
ORA-00904: "XX"."B": invalid identifier
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2821
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 3058
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 3017
ORA-06512: at line 1


SQL>

Chris Saxon
February 21, 2018 - 11:02 am UTC

Yes, good point.

Samuel Nitsche, February 21, 2018 - 11:16 am UTC

Thanks for the detailed feedback. Although most of the points (and all follow up comments so far) don't fit to the given example, which explicitly uses WITH-Clause to deal with all the potential wildcard-problems. The WITH-Clause with ordered columns is an essential part of the query and addresses all of the issues around adding/removing/reordering columns of the underlying tables, doesn't it?
Imagine a statement not only containing of 2 selects combined with UNION ALL but maybe 5. Writing down > 20 columns 5 times seems like a terrible approach to me. Therefore my example tries to overcome that problem by a WITH-Clause dealing with all the problems around wildcarding.

So the only remaining problem in that case is the difference between Source-Files and Database-compiled/stored code. And I agree that exceptions should be avoided.
Any ideas how else to adress such a situation (>20 "shared" columns, 5 UNION-ALLS)? I would expect the likelyness of introducing bugs in such a View/Query to be VERY high.
Chris Saxon
February 21, 2018 - 11:28 am UTC

Like I said, if this is a standalone query with many unions, then using * is "probably" OK ;)

If you plan on using it in a view, you can always make the unions another subquery. Then have the final query list the columns:

with info as ( 
  select id description_id, 
         prename,  
         name, 
         some, 
         other, 
         information 
  from   person_description
), unions as (
  select j.id, j.title, i.* 
  from   jedi j 
  inner join info i 
  on    j.fk_description = i.description_id 
  union all 
  select s.id, s.title, i.* 
  from   sith s 
  inner join info i 
  on    s.fk_description = i.description_id
)
  select id, title, description_id, 
         prename,  
         name, 
         some, 
         other, 
         information 
  from   unions;


So now you have at most two places to change the column list, instead of 2+however many union queries there are.

Samuel Nitsche, February 21, 2018 - 11:46 am UTC

Much thanks for that! This solution won't recompile to a static columns-list, so problem #3 is solved, too.

exchange subselects

Racer I., February 22, 2018 - 11:45 am UTC

Hi,

I'd probably try this :
with
ids as ( 
  select j.id, j.title, s.fk_description description_id 
  from   jedi j 
  union all 
  select s.id, s.title, s.fk_description description_id
  from   sith s
)
select i.id, i.title, d.description_id, 
  prename,  
  name, 
  some, 
  other, 
  information 
from ids i
  join person_description d ON (i.description_id = d.description_id)


regards,
Chris Saxon
February 22, 2018 - 4:52 pm UTC

Yep, that works too.

More to Explore

SQL

The Oracle documentation contains a complete SQL reference.