Options to quickly access large portions of rows
Question and Answer

Chris Saxon

Thanks for the question, Mikhail.

Asked: August 05, 2020 - 6:56 am UTC

Last updated: August 11, 2020 - 2:20 pm UTC

Version: 12.1.0.2

Viewed 1000+ times

You Asked

Hello, Tom.

We have a fact table that is partitioned by day and stores the last 90 days of data. Sometimes users of the application change the status of a record from 'ACTIVE' to 'CANCELED'.

There are a lot of heavy analytical queries against that table that do full scans but only consider the 'ACTIVE' records. The share of 'CANCELED' records can vary greatly over time, from 5% to 60%. Right now there are 37 million active rows and 67 million canceled ones, so a full scan could be almost three times faster.

My question is: what is the best option to quickly access all the active records?

A B-tree index won't help, because there are too many rows to retrieve.

Bitmap index seems to be a bad choice, since there are a lot of DML operations.

I wanted to try subpartitioning by list and moving rows to the 'CANCELED' subpartition, but I have immediate concerns:

There are 7 indexes on the table now. Moving a lot of rows between subpartitions would take a long time and could potentially fill up the undo if someone decides to change the status of tens of millions of rows at a time (users can and will do that).
Since the table is partitioned by day, any blank space left after row movement in partitions older than today won't be reused or reclaimed, and a full scan will take just as much time. That makes the whole idea almost useless.
I am afraid that shrinking the entire table could fill up the undo segment.

I don't have an environment close in specs to our PROD environment, so I can't even really test my concerns with undo.
Unfortunately we can't upgrade to 12.2 for a few more months, so an online move is not available.

Is there another option that I am missing, or should I just run shrink space partition by partition on a daily basis?

and Chris said...

Given that nobody's looking at millions of rows, I'm assuming that the queries are some form of aggregation, e.g.
select count/sum/max/etc, ...
from ...
group by ...


In which case I'd be looking at materialized views or Database In-Memory.

The advantage of materialized views is that they're included in your license. The downside is that to use them (either automatically via query rewrite or by querying them directly) you must be able to derive the results of your query from the MV. This may mean you need lots of them to cover the key queries. And since you have processes that can change the status of millions of rows at once, refreshing them may also be a challenge.
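As an illustration (the table fact_orders and its columns are hypothetical, not from the question), an MV that pre-aggregates only the active rows and is eligible for query rewrite might look something like this:

```sql
-- MV log, needed if you want fast (incremental) refresh.
create materialized view log on fact_orders
  with rowid (order_day, status, amount)
  including new values;

-- Aggregate MV over the active rows only. COUNT(*) and COUNT(amount)
-- are included because fast refresh of aggregate MVs requires them.
create materialized view mv_active_daily
  refresh fast on commit
  enable query rewrite
as
select order_day,
       status,
       count(*)      as cnt,
       count(amount) as cnt_amount,
       sum(amount)   as total_amount
from   fact_orders
where  status = 'ACTIVE'
group  by order_day, status;
```

Queries of the form "select order_day, sum(amount) from fact_orders where status = 'ACTIVE' group by order_day" could then be rewritten to read the much smaller MV instead of full-scanning the fact table. Note that fast refresh of aggregate MVs comes with a list of restrictions, so check DBMS_MVIEW.EXPLAIN_MVIEW against your real query.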

The downside of Database In-Memory is that it's an extra-cost option (though from 19.8 you can allocate up to 16 GB to the In-Memory column store at no extra cost). The upside is that all queries against the In-Memory tables can take advantage of it.
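A minimal sketch of enabling it (table and partition names are made up; INMEMORY_SIZE must be set at the instance level first, and the feature exists from 12.1.0.2):

```sql
-- Put the whole table in the In-Memory column store...
alter table fact_orders inmemory priority high;

-- ...or only selected (e.g. recent) partitions, to save memory:
alter table fact_orders modify partition p20200804 inmemory;
alter table fact_orders modify partition p20200701 no inmemory;
```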

Bitmap index seems to be a bad choice, since there are a lot of DML operations.

This is only an issue if you have concurrent writes.

Moving a lot of rows between subpartitions would take a long time and could potentially fill up the undo if someone decides to change the status of tens of millions of rows at a time (users can and will do that).
...
I don't have an environment close in specs to our PROD environment, so I can't even really test my concerns with undo.


Why not? Provided you can create a table of the same size with similar data, you can measure the amount of undo the operation generates. And assuming the production kit is faster, it also gives you a reasonable worst case for elapsed time.
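One way to measure this (table and user names below are placeholders): run the bulk update against the test copy and, before committing, check V$TRANSACTION from another session:

```sql
-- Session 1: the bulk status change against a test copy of the table
update fact_orders_test
set    status = 'CANCELED'
where  status = 'ACTIVE';

-- Session 2: undo consumed by that open transaction
-- (USED_UBLK = undo blocks, USED_UREC = undo records)
select s.sid, t.used_ublk, t.used_urec
from   v$transaction t
join   v$session     s on s.taddr = t.addr
where  s.username = 'TEST_USER';
```

Multiply USED_UBLK by your undo tablespace block size to get the undo footprint in bytes, then compare it to the undo tablespace size on production.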

You may also want to look at compressing the table; depending on your data, this may reduce its size considerably.
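For example (partition name is made up; on 12.1 a partition move is an offline operation, and UPDATE INDEXES keeps the indexes usable during the rebuild):

```sql
-- Compress one day's partition. BASIC compression is included in EE
-- but only compresses direct-path loads; ROW STORE COMPRESS ADVANCED
-- also handles conventional DML but needs the Advanced Compression option.
alter table fact_orders
  move partition p20200801
  row store compress advanced
  update indexes;
```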

Ultimately whatever you do will involve some kind of trade-off. You need to test to see what impact various actions have and compare the costs/benefits.

But rather than asking abstractly "how do I make full table scans faster?", look at specific queries and see how you can make them faster.

