Optimal order for columns in a table

Tom Kyte

Thanks for the question, yair.

Asked: December 25, 2011 - 2:57 pm UTC

Last updated: December 29, 2011 - 5:05 pm UTC

Version: 10.2.0.4


You Asked

hi tom,
1. I want to know what the optimal order for columns in a table is. Should I:
put the not null columns first?
put the short nullable columns (for example, varchar2(1)) before the long nullable columns (for example, varchar2(100))?
put the nullable columns that in most cases probably won't be null before the other nullable columns?
And are there other expert guidelines?
2. If the row in a table is laid out continuously, like:
length data length data ...
then a varchar2 can't sit at a fixed offset in the row, so I'm guessing you are using pointers. Should I therefore separate the varchar2 columns that will probably grow from the other varchar2 columns when ordering the columns of a table?
thanks.

and Tom said...

A row is stored in a manner similar to this on the block:

[null flag][length][data][null flag][length][data].......


In order to get to the 3rd column in a table, we have to parse over the first 2 columns (no pointers - just read the row byte by byte).

So, in general, you should put the most frequently accessed columns first for performance.

However, there is something to be said for putting the column(s) most likely to be null last: they will consume zero bytes. If we hit the end of a row before finding the Nth column, we know the Nth column is NULL (or, in 11g with the fast add of a NOT NULL column with a default value, the default value).

However, there is a convention to put primary keys first in all cases.

However....


there are too many howevers, and in general it really isn't going to matter too much. The bit about putting the most frequently accessed columns first would carry the most weight in a warehouse, where you are scanning over many millions of rows to build the answer (the repeated overhead of parsing over a lot of uninteresting columns would add up). In an OLTP system, where you are processing tens of rows, it wouldn't be as big of a deal, if at all.

In the strange case where you have dozens of columns that are almost always null, putting them last on a table with lots and lots of rows could save you quite a few bytes - but the table would have to be large and the columns would have to be frequently null (and all of them null at once).
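
A minimal sketch of that effect (the table names and row count are made up for illustration, and it assumes a scratch schema you can create tables in): the same data costs one byte less per row when the NULL column comes last, which shows up in AVG_ROW_LEN once statistics are gathered.

create table t_null_last  ( a number, b number );   -- nullable column B last
create table t_null_first ( b number, a number );   -- nullable column B first

insert into t_null_last  select rownum, null from dual connect by level <= 10000;
insert into t_null_first select null, rownum from dual connect by level <= 10000;

exec dbms_stats.gather_table_stats( user, 'T_NULL_LAST' );
exec dbms_stats.gather_table_stats( user, 'T_NULL_FIRST' );

-- T_NULL_LAST should report a smaller AVG_ROW_LEN: the trailing NULL is
-- simply not stored, while the leading NULL still costs a null/length byte.
select table_name, avg_row_len
  from user_tables
 where table_name in ('T_NULL_LAST','T_NULL_FIRST');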


So, my recommendation would be - put the columns in order that makes the most sense from an aesthetic and design perspective more than anything else.


Comments

what about varchar2

bob, December 27, 2011 - 8:46 am UTC

The DBA at my shop enforces a standard of putting varchar2 columns last; the likelihood of nulls is not considered. Is there even a real-world benefit to this?
Tom Kyte
December 27, 2011 - 9:19 am UTC

nope. If you access those varchar2 columns frequently and do not always want the columns placed in front of them, this approach would actually degrade performance (minimally).

Pretty much all data is stored in a "string-like" fashion.

A string will have a null flag and a length byte or bytes (<= 250 characters: 1 byte; > 250 characters: 2 bytes), followed by the data.

A number will have a null flag/length byte (numbers are 0..22 bytes in length) followed by the varying amount of data.

A binary_float will consume 5 bytes - a leading null flag/length byte followed by 4 bytes of floating point data.

A binary_double will consume 9 bytes - leading null flag/length followed by 8 bytes of floating point data.

and so on. We have to read that length byte to figure out how many bytes of the row constitute that column - read over that data to get to the next length byte and so on.



So, the further down the 'create' list a column is, the longer it takes to retrieve that column.
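
You can see that per-column bookkeeping with the built-in DUMP function, which reports the internal datatype code, the stored length in bytes, and the raw bytes of a value (a quick sketch; the exact type codes depend on the datatype and the lengths on your character set):

-- DUMP(expr) returns 'Typ=<code> Len=<bytes>: <raw bytes>' for the value
select dump(42)                   num_dump,  -- NUMBER: length varies with the value
       dump('hello')              str_dump,  -- text: Len=5 in a single-byte character set
       dump(to_binary_float(1.5)) bf_dump    -- BINARY_FLOAT: always 4 bytes of data
  from dual;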





ops$tkyte%ORA11GR2> @test
ops$tkyte%ORA11GR2> set echo on
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> /*
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> drop table t;
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> declare
ops$tkyte%ORA11GR2>         l_create long := 'create table t ( c1 number';
ops$tkyte%ORA11GR2> begin
ops$tkyte%ORA11GR2>         for i in 2 .. 1000
ops$tkyte%ORA11GR2>         loop
ops$tkyte%ORA11GR2>                 l_create := l_create || ',c'||i||' number default ' || i;
ops$tkyte%ORA11GR2>         end loop;
ops$tkyte%ORA11GR2>         execute immediate l_create || ')';
ops$tkyte%ORA11GR2> end;
ops$tkyte%ORA11GR2> /
ops$tkyte%ORA11GR2> insert into t (c1, c1000 ) select rownum, rownum from all_objects;
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );
ops$tkyte%ORA11GR2> */
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> @trace
ops$tkyte%ORA11GR2> alter session set events '10046 trace name context forever, level 12';

Session altered.

ops$tkyte%ORA11GR2> begin
  2          for i in 1 .. 10
  3          loop
  4                  for x in ( select c1 from t )
  5                  loop
  6                          null;
  7                  end loop;
  8                  for x in ( select c1000 from t )
  9                  loop
 10                          null;
 11                  end loop;
 12          end loop;
 13  end;
 14  /

PL/SQL procedure successfully completed.


SELECT C1 FROM T

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute     10      0.00       0.00          0          0          0           0
Fetch     7230      6.41      15.72     414610     420920          0      722790
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     7241      6.41      15.72     414610     420920          0      722790
********************************************************************************
SELECT C1000 FROM T

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute     10      0.00       0.00          0          0          0           0
Fetch     7230      8.66      17.93     421260    3304860          0      722790
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     7241      8.66      17.94     421260    3304860          0      722790




In that case, the overhead was partly the parsing over 1,000 columns and partly chasing down the chained row pieces (any row with more than 254 columns will be stored in multiple row pieces).
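
If you want to observe that row-piece walking from your own session (a hedged aside - it assumes you can query the V$ views), the session statistic "table fetch continued row" counts fetches that had to follow a row into another piece; snapshot it before and after querying the trailing column and compare:

select s.name, m.value
  from v$statname s, v$mystat m
 where s.statistic# = m.statistic#
   and s.name = 'table fetch continued row';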

If we change 1000 to 250 in the example code above:


SELECT C1 FROM T

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute     10      0.00       0.00          0          0          0           0
Fetch     7230      0.62       0.62       1117      94520          0      722790
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     7241      0.62       0.62       1117      94520          0      722790
********************************************************************************
SELECT C250 FROM T
  
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute     10      0.00       0.00          0          0          0           0
Fetch     7230      0.96       0.97          7      94520          0      722790
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     7241      0.96       0.97          7      94520          0      722790




there is still a difference - even without the multiple row pieces.



so, if the varchar2 columns are almost always selected, but the columns the DBA makes you put first are only sometimes selected, they've actually penalized your performance.



I'll just say again:

So, my recommendation would be - put the columns in order that makes the most sense from an aesthetic and design perspective more than anything else.

on 250 column

Rajeshwaran Jeyabal, December 28, 2011 - 10:44 am UTC

Tom:

"there is still a difference - even without the multiple row pieces"

I repeated your example for 250 columns and I still don't see any difference in IO or CPU utilization.

Tkprof shows this:

********************************************************************************

SQL ID: 99rpn2705s3bk
Plan Hash: 1601196873
SELECT T1 
FROM
 T


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      718      0.07       2.40      10249      10867          0       71753
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      720      0.07       2.40      10249      10867          0       71753

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 91     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
  71753  TABLE ACCESS FULL T (cr=10867 pr=10249 pw=0 time=4857649 us cost=2821 size=358765 card=71753)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  reliable message                                1        0.00          0.00
  enq: KO - fast object checkpoint                1        0.05          0.05
  direct path read                              163        0.04          2.25
  asynch descriptor resize                        1        0.00          0.00
********************************************************************************

SQL ID: ftt836cqgc3s2
Plan Hash: 1601196873
SELECT T250 
FROM
 T


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      718      0.06       1.88      10249      10867          0       71753
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      720      0.06       1.88      10249      10867          0       71753

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 91     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
  71753  TABLE ACCESS FULL T (cr=10867 pr=10249 pw=0 time=4406928 us cost=2839 size=358765 card=71753)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  direct path read                              157        0.03          1.74
  asynch descriptor resize                        1        0.00          0.00
********************************************************************************


Here is the script I used for testing.

drop table t purge;

declare
 l_val long :=' create table t(t1 number default 1 ';
begin
 for i in 2..250
 loop
  l_val := l_val||' ,t'||i||' number default '||i ;
 end loop;
  l_val := l_val ||' ) ';
  execute immediate l_val;
end;
/

insert /*+ append */ into t(t1,t250)
select rownum,rownum
from all_objects
/

commit;

exec dbms_stats.gather_table_stats(user,'T');

alter session set timed_statistics=true;
alter session set events '10046 trace name context forever, level 12';
begin
 for x in (select t1 from t)
 loop
  null;
 end loop;
 
 for x in (select t250 from t)
 loop
  null;
 end loop;
end;
/

Tom Kyte
December 29, 2011 - 10:57 am UTC

you didn't repeat my example. repeat my example please.

on 250 column

Rajeshwaran Jeyabal, December 28, 2011 - 11:08 am UTC

I ran it again and again, and still see no major difference.

begin
    for i in 1..10
 loop
  for x in (select t1 from t)
  loop
   null;
  end loop;
  
  for x in (select t250 from t)
  loop
   null;
  end loop;
 end loop; 
end;
/


********************************************************************************
SELECT T1 
FROM
 T


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute     10      0.00       0.00          0          0          0           0
Fetch     7180      0.67      20.72     102490     108671          0      717530
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     7191      0.67      20.72     102490     108671          0      717530

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 91     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
  71753  TABLE ACCESS FULL T (cr=10868 pr=10249 pw=0 time=4651985 us cost=2821 size=358765 card=71753)

********************************************************************************
SELECT T250 
FROM
 T


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute     10      0.00       0.00          0          0          0           0
Fetch     7180      0.87      22.76     102490     108670          0      717530
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     7191      0.87      22.76     102490     108670          0      717530

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 91     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
  71753  TABLE ACCESS FULL T (cr=10867 pr=10249 pw=0 time=3502290 us cost=2839 size=358765 card=71753)
********************************************************************************

Tom Kyte
December 29, 2011 - 11:02 am UTC

you do not think there is a major difference, when run two with column 250 used 130% of the CPU of the first run (0.87 versus 0.67 CPU seconds)?

That it used 30% more CPU for 3/4ths of a million rows is not a major difference?

That in a data warehouse, where you might be processing tens of millions of rows (or more) in a query, 30% more CPU isn't going to be a problem for you?

Are you sure about that?

You just about perfectly replicated what I presented - I myself feel that a 30% difference in CPU utilization is something to consider.


In any case, all I was presenting was the fact that "if you follow a scheme that forces you to put frequently accessed columns last, as the DBA above does, you are not helping performance, you are hindering it".


on 250 column

Rajeshwaran Jeyabal, December 29, 2011 - 1:50 pm UTC

Tom:

When I said "I ran it again and again and still no major difference", I meant by looking at physical and logical IO. I completely missed the CPU numbers. Thanks for helping me on that.
Tom Kyte
December 29, 2011 - 5:05 pm UTC

Why would it be different on IO???

I said I did the 250-column one for just that reason ... one row piece, not chained like the 1000-column example.

The IO was supposed to be the same for 250 columns. I was demonstrating that retrieving column 250 is more expensive than retrieving column 1.
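
A hedged way to verify that row-piece difference yourself: CHAIN_CNT in USER_TABLES is populated by ANALYZE (not by DBMS_STATS), and it counts rows stored in more than one piece.

analyze table t compute statistics;

select table_name, num_rows, chain_cnt
  from user_tables
 where table_name = 'T';

-- With 250 columns CHAIN_CNT should be 0 (one row piece per row); rebuilt
-- with 1000 columns it should be roughly NUM_ROWS, since every row needs
-- multiple row pieces.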