Skip to Main Content
  • Questions
  • Table columns position and performance impact

Breadcrumb

We're taking a break this week as Connor and Chris will both be at CloudWorld, the premier Oracle conference of 2024. If you're in Vegas, please come say Hi or pop into our sessions

Question and Answer

Tom Kyte

Thanks for the question, Vipin.

Asked: January 04, 2011 - 5:41 pm UTC

Last updated: February 06, 2012 - 9:05 pm UTC

Version: 10.2.0.4

Viewed 1000+ times

You Asked

Hi Tom,

I have a question "Do columns position in a table have any impact on overall query performance?"
If we create table
-- create table_A (col1 varchar2(2000), col_date date, col_id number);
-- create table_b (col_id number, col_date date, col1 varchar2(2000));

Do you think query will perform better with table_B when it will have millions of records as compared to table_A?

Thanks,

Vipin


and Tom said...

It depends.


If you always access col1 and sometimes do not access col_date, col_id or just col_id - then TABLE_A would be your best bet.

If you always access col_id and or col_date - but only sometimes access col1 - then TABLE_B would be.

The difference in this case would be minuscule. It would come from the fact that in order to get the i'th column from a row - we have to parse over all of the preceding fields. Also, if the row was chained - and we needed the last column - we'd have to do an additional IO to retrieve it.

But in general, the difference is so tiny, so unmeasurable in real life that I would not worry about it. Put the columns in the order that makes the most sense.


For example - most people - pretty much all people - would put the primary key FIRST in the table. It is a universally accepted approach - convention.

But, we frequently query:


select first_name, last_name, address from table where primary_key = :x;

Here - we don't access the primary key in the table (we use an index to find rowids). We force the database however to parse over the primary_key column (because it is first) to get to the other attributes. Therefore - there would be a reason for putting the primary key LAST in your table.

On the other hand... If you put the primary key last - you'll make it so that you never have trailing null columns in any row (the last column will ALWAYS be not-null) and we won't be able to save a row without the trailing null columns (if the last N columns of a given row are NULL - we use zero bytes of storage for them - if the last column however is NOT NULL - we have to use storage for EVERY column in the table)


So - please don't go move your primary keys - the performance differential would be teeny tiny and the potential downside could be storage related.


I would put columns LAST in the table that would be mostly NULL - putting the column(s) that have the highest probability of being null last can make the rows take a little less space.

Try to put the most frequently accessed columns in the first 255 columns of the table create if you can and are using more than 255 columns (since every row is chained if you use more than 255 columns - with the additional columns above 255 (and every set of 255 columns) is stored on a different rowpiece somewhere else in the table).


Rating

  (6 ratings)

We're not taking comments currently, so please try again later if you want to add a comment.

Comments

A reader, January 05, 2011 - 12:35 pm UTC

"if the last column however is NULL - we have to use storage for EVERY column in the table) "

you mean to say NOT NULL in above right?

Thanks





Tom Kyte
January 05, 2011 - 12:45 pm UTC

correct and... corrected!

Date when column was added ??

Ankit, January 24, 2012 - 11:25 pm UTC

Hi Tom
I need to know when a particular column was added to a table.
I know about dba_objects which give the time of last DDL but that didn't solve my purpose.
I came across last_analyzed column in dba_tab_columns, but as the name suggests, its gives the date when that column was last analyzed. I added a column on 24 Jan 2012. And this column was also found set to 24 Jan 2012. Does it mean that Oracle immediately analyses a column when it is added ?
I developed this hunch while trying to know when a particular column was added to a table.

Thanks.
Tom Kyte
January 25, 2012 - 10:35 am UTC

we do not track such minutiae. You would have to enable auditing or use a DDL event trigger to track such events.

Loz, January 26, 2012 - 5:06 pm UTC

Surely you meant to say
"refer to your change control preocedure and documentation"
:)

How new bytes are stored

Rashid, January 26, 2012 - 9:38 pm UTC

'.......if the last N columns of a given row are NULL - we use zero bytes of storage for them.......'

Hi Tom,

what happens in case we have millions of records in table and 80% of data-rows where last 10 columns are NULL. So we used zero bytes for storage.
Now when we update few rows to have proper values in those columns. what happens when extra bytes are needed to store these values,
does the following data rows are shifted ?
Tom Kyte
January 31, 2012 - 3:24 pm UTC

It will depend on whether the rows still fit on the block or not.

When you create a table, by default the pctfree will be 10% - meaning, we'll insert into that block until there is 10% of the block only left free - then we stop inserting.

The 10% is left behind for future updates. As long as the aggregate size of the rows stored on the block do not increase more than 10% - everything will stay more or less "put" - where it is.

But - what if you increase the size of the rows on a given block so that they need more space altogether than is available on that one block? Now you have to move (we call it migrating) some of the rows - since they won't fit anymore.

so, what we do is find another block where that updated row will in fact fit and put it there. On the original block - we leave behind a forwarding address (rowid) so that anyone that comes to that block looking for the row will know it moved (migrated). This allows us to skip updating every index (which has the rows rowid in it) when this happens.

The downside however is when you access this row later via an index - you'll have the original rowid and will have to go to that block to read the forwarding address to go and read another block. You'll have to do two table accesses to retrieve the row instead of just one.


In the future, if you *know* this is going to happen, you should plan ahead by setting the pctfree on this segment higher when you create it - that way you'll avoid the operation of migrating the row.


Thomas, February 06, 2012 - 3:46 am UTC

How does Oracle handle a row update if the row requires more space than before and the block as enough space left?

Am I right to assume that the whole block data can be rearranged (moving row start offsets around) when one or more rows of a block are updated, as long as the relative row order is left unchanged?

Tom Kyte
February 06, 2012 - 8:10 am UTC

correct, we may coalesce the block in order to move all of the free space together - making room for the row. There is a row directory in the block header pointing to each row.

read http://docs.oracle.com/cd/E11882_01/server.112/e25789/logical.htm#CIHEIFJC for more details, it covers this.

stats on columns access

shubi, February 06, 2012 - 2:53 pm UTC

Is there a way to know which column is most access/filter
so i can organize my table accordingly?
Tom Kyte
February 06, 2012 - 9:05 pm UTC

But in general, the difference is so tiny, so unmeasurable in real life that I would not worry about it. Put the columns in the order that makes the most sense.


please take that to heart. there are so many other much more important things to worry about in life.