Question and Answer

Connor McDonald

Thanks for the question.

Asked: October 20, 2017 - 8:40 am UTC

Last updated: October 22, 2017 - 9:25 am UTC

Version: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

Viewed 10K+ times

You Asked

Hi,

I'm trying to improve the performance of some queries in my production environment.

The current scenario is:

1) retrieve some data from "ParentTable". The query, in application code, is composed dynamically based on the specified search criteria (yes, I know this is bad, that's why I'm changing it ;-) ). So the query changes when different search criteria are specified:

SELECT parentField1, parentField2
  FROM ParentTable
 WHERE column1 = :paramSearch1 --added dynamically only if needed
   AND column2 = :paramSearch2 --added dynamically only if needed
   ...
   AND columnX = :paramSearchX --added dynamically only if needed


This query can return a lot of records (even over one million)
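The dynamic construction described above can be sketched in application code like this (Python; the column and bind names are hypothetical, mirroring the snippet above). Note that only the bind *names* are generated into the SQL text, never the values, so each combination of criteria produces one shareable, injection-safe statement:

```python
def build_parent_query(criteria):
    """Build the parent query, adding a bind-variable predicate only for
    the criteria that were actually supplied (hypothetical names)."""
    sql = "SELECT parentField1, parentField2 FROM ParentTable"
    predicates = []
    binds = {}
    for i, (column, value) in enumerate(sorted(criteria.items()), start=1):
        bind_name = f"paramSearch{i}"
        predicates.append(f"{column} = :{bind_name}")
        binds[bind_name] = value          # value travels as a bind, not as text
    if predicates:
        sql += " WHERE " + " AND ".join(predicates)
    return sql, binds

sql, binds = build_parent_query({"column1": "A", "column2": 42})
print(sql)
# SELECT parentField1, parentField2 FROM ParentTable WHERE column1 = :paramSearch1 AND column2 = :paramSearch2
```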

2) The results of this query are used to perform a set of queries on other tables to retrieve related entities. Assuming that the previous query has returned 2500 records, the following queries are (dynamically) built to retrieve the required entity data:

--FIRST CHILD TABLE
SELECT *
  FROM MyChildTable1
 WHERE ((childCol1, childCol2) IN ((:parentValueCol1_1, :parentValueCol2_1), (:parentValueCol1_2, :parentValueCol2_2), ..., (:parentValueCol1_1000, :parentValueCol2_1000)))
    OR ((childCol1, childCol2) IN ((:parentValueCol1_1001, :parentValueCol2_1001), (:parentValueCol1_1002, :parentValueCol2_1002), ..., (:parentValueCol1_2000, :parentValueCol2_2000)))
    OR ((childCol1, childCol2) IN ((:parentValueCol1_2001, :parentValueCol2_2001), (:parentValueCol1_2002, :parentValueCol2_2002), ..., (:parentValueCol1_2500, :parentValueCol2_2500)))

--SECOND CHILD TABLE
SELECT *
  FROM MyChildTable2
 WHERE ((childCol1, childCol2) IN ((:parentValueCol1_1, :parentValueCol2_1), (:parentValueCol1_2, :parentValueCol2_2), ..., (:parentValueCol1_1000, :parentValueCol2_1000)))
    OR ((childCol1, childCol2) IN ((:parentValueCol1_1001, :parentValueCol2_1001), (:parentValueCol1_1002, :parentValueCol2_1002), ..., (:parentValueCol1_2000, :parentValueCol2_2000)))
    OR ((childCol1, childCol2) IN ((:parentValueCol1_2001, :parentValueCol2_2001), (:parentValueCol1_2002, :parentValueCol2_2002), ..., (:parentValueCol1_2500, :parentValueCol2_2500)))
    
-- AND SO ON, FOR NEEDED CHILD TABLES
...


The OR conditions are added dynamically, split into blocks of 1000 (the limit on the number of expressions in an IN list). So, if the number of records retrieved by the first query changes, the SQL text changes and Oracle "does not recognize" the query, losing the benefit of any previous hard parse. (Am I wrong on this point?) Furthermore, the queries are not very fast... ;-)
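The chunking just described can be sketched in Python (hypothetical helper, illustrative names) to show why the generated SQL text differs whenever the parent row count changes, forcing a fresh hard parse:

```python
def chunk(items, size=1000):
    """Split items into blocks of at most `size`, mirroring the
    1000-expression limit on an Oracle IN-list."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def build_child_query(table, n_pairs, size=1000):
    """Build the child query text: one IN-list of bind placeholders per
    chunk, OR-ed together. A different n_pairs yields different SQL text."""
    ors = []
    offset = 1
    for block in chunk(list(range(n_pairs)), size):
        placeholders = ", ".join(
            f"(:parentValueCol1_{offset + j}, :parentValueCol2_{offset + j})"
            for j in range(len(block))
        )
        ors.append(f"(childCol1, childCol2) IN ({placeholders})")
        offset += len(block)
    return f"SELECT * FROM {table} WHERE " + " OR ".join(ors)

# 2500 parent rows -> three OR-ed IN-lists; 2499 rows -> a different text,
# so the shared pool sees a brand-new statement that must be hard-parsed.
q_2500 = build_child_query("MyChildTable1", 2500)
q_2499 = build_child_query("MyChildTable1", 2499)
print(q_2500 != q_2499)  # True
```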

So, with the optimization I'm making, I'm rewriting the queries so that the SQL text is always the same: for the first query, regardless of the search criteria used; for the subsequent queries, regardless of the number of results returned by the first query.

The first query has become:

SELECT parentField1, parentField2
  FROM ParentTable
 WHERE (:paramSearch1 IS NULL OR :paramSearch1 = column1)
   AND (:paramSearch2 IS NULL OR :paramSearch2 = column2)
   ...
   AND (:paramSearchX IS NULL OR :paramSearchX = columnX)


In this way the query is always the same, so Oracle can reuse it and apply the necessary optimizations as the query is used over time. (Is that right?)

The other queries, to achieve the same goal and avoid dynamically concatenating the various OR conditions, have been redefined in this way:

first concatenate the results obtained from the first query:

  ('parentValueCol1_1', 'parentValueCol2_1')#('parentValueCol1_2', 'parentValueCol2_2')#...#('parentValueCol1_2500', 'parentValueCol2_2500')


and then I pass this string as parameter to the query:

  SELECT *
    FROM MyChildTable1
   WHERE :concatList IS NULL
      OR '(''' || childCol1 || ''', ''' || childCol2 || ''')'
         IN (    SELECT REGEXP_SUBSTR (:concatList, '[^#]+', 1, LEVEL)
                   FROM DUAL
             CONNECT BY REGEXP_SUBSTR (:concatList, '[^#]+', 1, LEVEL) IS NOT NULL);



The query works... but, of course, the limit is now on the length of the :concatList parameter, capped at 4000 characters (and I'm not sure the performance is better than the previous version of the query... :-) ). To go beyond that I would have to split the concatenated string and dynamically add OR conditions, and the initial problem reappears.
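To see that limit concretely, both sides of the scheme can be emulated in Python (hypothetical value names): building the '#'-delimited string, and the token split that the REGEXP_SUBSTR/CONNECT BY subquery performs. With 2500 pairs the string blows well past 4000 characters:

```python
def build_concat_list(pairs):
    """Build the '#'-delimited string of ('v1', 'v2') tuples expected by
    the :concatList bind. (Values containing '#' would break the split.)"""
    return "#".join(f"('{a}', '{b}')" for a, b in pairs)

def split_concat_list(s):
    """Emulate REGEXP_SUBSTR(:concatList, '[^#]+', 1, LEVEL): return every
    maximal run of non-'#' characters."""
    return [tok for tok in s.split("#") if tok]

pairs = [(f"p1_{i}", f"p2_{i}") for i in range(1, 2501)]
concat = build_concat_list(pairs)
print(len(concat) > 4000)                 # True: over the bind-length limit
print(split_concat_list(concat)[0])       # ('p1_1', 'p2_1')
```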

Then:
1) Is my assumption correct that, by always executing the same parameterized query, Oracle will optimize its execution as it runs over time?
2) Is the "tautology technique" the best approach for first query?
3) Is there a solution regarding the implementation of the second query, and the limit on the concatenated string?
4) Other solutions, or strategies to follow, to solve the whole problem?

Thanks in advance.

and Connor said...

1) Is my assumption correct that, by always executing the same parameterized query, Oracle will optimize its execution as it runs over time?

Not really. For ad hoc search, where you want to give the optimizer the best possible information to work with, I would stay with your initial approach, i.e. only add predicates as required. (I'm assuming here you are not running these queries hundreds of times per second.)

2) Is the "tautology technique" the best approach for first query?

See above.

3) Is there a solution regarding the implementation of the second query, and the limit on the concatenated string?
4) Other solutions, or strategies to follow, to solve the whole problem?

If the number of rows returned from query 1 is arbitrary, I think concatenated strings or anything similar is a bad, bad idea. It doesn't scale, and it hits limits, as you've already seen.

I would store the results in a global temporary table. So your first query becomes:

insert into my_gtt
select parentField1
...

and your followup query is:

select *
from my_table
where parent_id in ( select parentField1 from my_gtt );

In 12.2 you can take advantage of some new features around GTTs and session-level statistics. Until then, you can customize the solution (if needed) to add explicit hints that govern the execution of the second query based on the number of rows in the GTT, which you'll know directly from the insert via SQL%ROWCOUNT.
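A minimal sketch of that rowcount-driven customization (Python; the threshold and the specific hints are illustrative assumptions, not a recommendation): the insert's rowcount decides which hint, if any, the follow-up query gets, while the SQL text otherwise stays fixed per regime:

```python
def child_query_for(rowcount):
    """Pick a hint for the follow-up query based on how many rows the
    insert put into the GTT (the 1000-row threshold is illustrative)."""
    if rowcount < 1000:
        hint = "/*+ USE_NL(c) */"    # few rows: favour nested loops
    else:
        hint = "/*+ USE_HASH(c) */"  # many rows: favour a hash join
    return (f"SELECT {hint} c.* FROM my_table c "
            f"WHERE c.parent_id IN (SELECT parentField1 FROM my_gtt)")

print(child_query_for(50))       # nested-loops variant
print(child_query_for(500000))   # hash-join variant
```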


Comments

A reader, October 23, 2017 - 1:20 pm UTC

Thanks for your reply.

With regard to the first point, the initial query may be run by many connected users simultaneously, each with their own search parameters, generating several simultaneous query variants.

For followup queries, I will try to apply your suggestion to use a Global Temporary Table.

Thank you.
