Question and Answer

Chris Saxon

Thanks for the question, Dermot.

Asked: October 07, 2020 - 7:32 am UTC

Last updated: October 07, 2020 - 2:13 pm UTC

Version: 12c SE

Viewed 1000+ times

You Asked

We're considering the value of using a GTT (global temporary table) approach to address a variety of 'staging table' requirements in our applications. We typically load several MB of data (50-100 MB per frequent, concurrent data load) into staging tables, from where we transform, validate and consolidate the data into persistent portfolio data tables, and then delete the load-specific data from the staging table(s).

We expect that replacing our staging tables with equivalent GTTs, indexes etc. will remove the overhead of deleting staged data at the end of each data load process. We also understand, although we haven't found confirmation in the Oracle docs, that GTTs will use PGA space where possible before spilling data over into the temp tablespace if PGA is depleted. If our understanding here is not misguided, then we might also consider splitting our data-loading flows into smaller chunks to make better use of available PGA memory?
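By way of illustration, something along these lines is what we have in mind as a replacement for a staging table (the table and column names below are invented just for the example):

create global temporary table staging_load (
  load_id    number,
  trade_ref  varchar2(30),
  amount     number
)
on commit delete rows;   -- rows disappear automatically when the load transaction commits
-- or: on commit preserve rows, if the transform/validate steps span several transactions in one session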

Our question really is - are we on the right track in our understanding of how to make the best possible use of the GTT approach, and are there any other key considerations or policy rules that you would encourage?

thanks for your time....

dermot..

and Chris said...

The database stores data for temporary tables in temp segments in the temp tablespace.

If our understanding here is not misguided, then we might also consider splitting our data-loading flows into smaller chunks to make better use of available PGA memory?

This sounds like a case of premature optimization - I don't see a reason to design your process to do this from the outset.

SQL processing is fastest when you have few statements which find/change all the necessary rows. So I'd start by processing all the data in one pass, only switching from this if you hit some edge cases where this isn't true.
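For example, the transform/consolidate step could be a single statement along these lines (table and column names are invented here, purely to show the shape):

insert into portfolio_data (load_id, trade_ref, amount)
select load_id,
       upper(trade_ref),    -- example transformation
       sum(amount)          -- example consolidation
from   staging_load
where  amount is not null   -- example validation rule
group  by load_id, upper(trade_ref);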

All SQL processing that is unable to fit in memory will spill to temporary disk - this is common for large sorts, hash joins, and groupings. If, after writing the load process, you find it is using lots of temp and this is making it too slow, splitting it up may make sense.
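If you want to check this, the workarea statistics give a rough picture of how much memory and temp a statement's sorts/hash joins used - for example (plug in the SQL_ID of your load statement for the placeholder bind):

select operation_type,
       last_memory_used  / 1024 / 1024 as last_mem_mb,
       last_tempseg_size / 1024 / 1024 as last_temp_mb,
       optimal_executions,
       onepass_executions,
       multipasses_executions
from   v$sql_workarea
where  sql_id = :load_sql_id
order  by last_tempseg_size desc nulls last;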


Comments

GTT's

dermot, October 07, 2020 - 11:24 am UTC

Chris, thanks for your prompt response.
To be clear, are you suggesting that GTTs on creation never make use of available PGA space, but are always created as temp tables in the temp tablespace at the outset?
Chris Saxon
October 07, 2020 - 2:11 pm UTC

When you insert into a GTT, the data are saved in temp segments.

Depending on your SQL statements, they might still use PGA (sorts, hash joins, etc. start by using PGA, spilling to temp when they run out), but the rows themselves are stored in temp.
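You can see the distinction for yourself by inserting into the GTT and then checking your session's temp segment usage - something like this (an 8K block size is assumed for the MB arithmetic):

select u.segtype,            -- 'DATA' for the GTT rows; 'SORT'/'HASH' for spilled work areas
       round(sum(u.blocks) * 8192 / 1024 / 1024) as approx_mb
from   v$tempseg_usage u
join   v$session s on s.saddr = u.session_addr
where  s.sid = sys_context('userenv', 'sid')
group  by u.segtype;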

Alternatively

Peter Nosko, October 07, 2020 - 11:53 am UTC

Have you considered loading from external tables?
Chris Saxon
October 07, 2020 - 2:13 pm UTC

OP doesn't say where the data comes from - it may be something other than flat files.

But external tables are worth looking into if that is the source.
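For reference, a bare-bones external table over a CSV file looks roughly like this (the directory object, file name and columns are invented for the example; the directory object must already exist and point at the folder holding the file):

create table staging_load_ext (
  load_id    number,
  trade_ref  varchar2(30),
  amount     number
)
organization external (
  type oracle_loader
  default directory load_dir
  access parameters (
    records delimited by newline
    fields terminated by ','
  )
  location ('load_data.csv')
)
reject limit unlimited;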

GTT's

dermot, October 07, 2020 - 2:28 pm UTC

Thank you again chaps for your very useful advice,

dermot..
