Skip to Main Content
  • Questions
  • How to Optimize design a 1000+ columns in a Oracle DB

Breadcrumb

May 4th

Question and Answer

Connor McDonald

Thanks for the question, raj.

Asked: March 20, 2018 - 7:47 pm UTC

Last updated: March 28, 2018 - 1:48 am UTC

Version: 12c

Viewed 10K+ times! This question is

You Asked

Hi,

We need to store 5 minute interval data, with additional attributes, which will be around 1000+ columns. How to design this optimally for a Fact Table? Doe we have any columnar design in Oracle, such as a vertical split? Please provide your expert advice on how to handle this with optimal performance.

Many Thanks
R

and Chris said...

There's a hard limit of 1,000 columns per table in Oracle Database. So you have to split it into many tables.

It's also worth bearing in mind the maximum size of a row piece is 255 columns. Once you exceed this you can run into some issues:

https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:9530992000346894870
https://jonathanlewis.wordpress.com/2015/02/19/255-columns/
https://jonathanlewis.wordpress.com/2017/05/19/255-columns-2/
https://jonathanlewis.wordpress.com/2018/02/28/255-columns-3/

But if you split it into many tables each with <= 255 columns, you'll have more joins. So ultimately you need to test your application to see how these trade-offs affect you.

If you can provide more details on what you're doing and specific problems you're hitting add it as a review. And we'll see how we can help.

Rating

  (5 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A reader, March 22, 2018 - 4:26 am UTC

Thank you. Can you please send me the optimal sample model design for a split table, which has more than 255+ columns.

Thx
R
Chris Saxon
March 22, 2018 - 5:15 pm UTC

Can you please send us the complete business requirements, attributes you're storing and details of the queries you'll run?

A reader, March 23, 2018 - 4:09 am UTC

OK. Our requirement is to store variety on intervals coming in a 24 hour period in a single table. That includes some meters sending 5 min, 15 min and 60 min. So we will have around 288 Columns for intervals along with aggregate daily columns, such as avg daily interval, total 24hr interval, peak 24hr interval and so forth...

Again, this is for reporting, users wanted to see this at 5 min or only 15 min or only 60 min based on their need. They are used to seeing it as a horizontal view, when they do select of the meterid along with the relevant intervals (columns).

Please advice on the best possible design.. Also as per the previous recommendation can you refer me a sample split design.

Thx
Chris Saxon
March 26, 2018 - 10:33 am UTC

There is no "sample split design". What's best depends on your requirements.

But.

Do you really need each interval to be a column?

Surely each should be a row in the table?

Not only does this avoid the whole "how do we split up the columns" issue, it's more flexible. If it turns out you need 10min intervals too, you insert these. No DDL needed.

And could you calculate the averages, peaks and so on at runtime?

If these queries are too slow, you could create materialized views to store the results.

Finally if you need to display intervals as columns, you can use unpivot.

Screen size recommendations?

Stew Ashton, March 26, 2018 - 1:58 pm UTC

What size screen do we need to see those 1000 columns all at once?

Thanks in advance.
Chris Saxon
March 26, 2018 - 2:19 pm UTC

Set the font size to the smallest possible and only show the first character from each column and I'm sure they'll all display on my phone ;)

Guessification

Jonathan Lewis, March 27, 2018 - 11:42 am UTC

Taking a wild stab at requirements -

Given 5, 15, and 60 minute intervals that's 288 + 72 + 24 = 384 intervals per day. At 3 readings per interval that's 1152 values per interval - which might be why the request is 1000+ columns to get one day packed as tightly as possible

For physical colocation, clean design, avoiding all the disasters of having 1,000 columns etc. - I think this looks like a good case of using an IOT:

(meter_id, reading_date_midnight, interval_type, interval_id, val1, val2, val3).

PK is first 4 columns - compress on 1st 3.
Interval_type is '5', '15', '60' (or equivalent, constrained)
Interval_id is 1-288, 1-72, 1-24 - could use table constraint to validate type against maximum id.

Almost guaranteed that all the data for a given meter on a given day will fall inside one or two leaf blocks.

Main drawback: over time the index will be significantly larger than it needs to be thanks to 50/50 leaf block splits, though.

Irritation: the requirement invites as range/hash partitioning solution on date/meter, but range/hash is not allowed for IOTs



Chris Saxon
March 27, 2018 - 1:53 pm UTC

Good suggestions, thanks for stopping by Jonathan.

A reader, March 27, 2018 - 8:39 pm UTC

Hi Jonathan,

Thanks you.

My current design is as below

(meter_id, reading_date_midnight, interval_type, interval_id, interval_value).

not sure why we need attributes for multiple values like
val1, val2, val3).....???


Connor McDonald
March 28, 2018 - 1:48 am UTC

It was based on as assumption of 3 readings per interval

More to Explore

Design

New to good database design? Check out Chris Saxon's full fundamentals class.