Question and Answer

Connor McDonald

Thanks for the question, Aravind.

Asked: August 13, 2017 - 2:43 pm UTC

Last updated: August 14, 2017 - 2:21 pm UTC

Version: Oracle 11g

Viewed 1000+ times

You Asked

Hi,

I'm trying to write a stored procedure which takes in any SQL query (select * from ...) and dumps the fetched rows into another table (DYNAMIC_QUERY_TABLE) for quick access, i.e., once this stored procedure has run, an external application (such as Java) will query this table to fetch the rows.
The stored procedure is as follows:

create or replace procedure Query_To_Table(sql_query IN VARCHAR2)
AS
  query VARCHAR2(10000);
BEGIN
  -- Drop the table if it already exists
  FOR i IN (SELECT null FROM user_tables WHERE table_name = 'DYNAMIC_QUERY_TABLE') LOOP
    EXECUTE IMMEDIATE 'DROP TABLE DYNAMIC_QUERY_TABLE';
  END LOOP;

  -- Dump the data from the dynamic query into a table
  query := 'CREATE TABLE DYNAMIC_QUERY_TABLE AS ' || sql_query;
  EXECUTE IMMEDIATE query;
END Query_To_Table;


I want to gather statistics on the input query "sql_query": the CPU cost, the approximate execution time, and the approximate I/O time needed to fetch all the rows for the query.

I have two questions:
1. Is this way of fetching data and dumping it into a table efficient in terms of performance?
The input query can be very complex, containing many joins on many tables.
I chose this method because doing DB operations in a stored procedure is better than doing them from an external application connected to the DB.
Is there any other way of achieving this?
2. The statistics that I want to gather for the query have to be collected before it is actually executed to fetch the data and dump it into DYNAMIC_QUERY_TABLE. How can I achieve this? The view v$sql provides this information, but only for queries that have already been executed.

Thanks
Regards
Aravind


and Chris said...

This is a terrible idea. Just run the query!

Not only does dropping and creating the table add more work, it causes all sorts of concurrency problems. What happens when two people try to run this process at the same time?

You risk the first session getting "table or view does not exist" and other errors. That's bad!

I'm not sure what you're trying to achieve by storing statistics such as "the CPU cost, approx execution time, approx IO time" etc. If you capture these for the source query, they have no relevance to fetching from dynamic_query_table. The first has to do all your joining, grouping, etc. The second just full scans a table.

There's also no general way to find how long it takes a query to run. Other than running it!

If you're storing this data because your app needs to fetch the results many times, you could create a materialized view instead. This stores the result of your query. If the SQL is simple enough, you can make it "fast refresh on commit", and Oracle Database will keep it in sync when you change data.

If not, you can schedule refreshes.

You can then point your app at the MV to get the data. This is similar in effect to using the dynamic_query_table. But without all the drop/create nonsense!
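For example, here is a minimal sketch (the SALES table and its columns are hypothetical, and fast refresh of an aggregate MV has extra requirements, such as a materialized view log and a COUNT(*) column):

-- Materialized view log, required for fast refresh
CREATE MATERIALIZED VIEW LOG ON sales
  WITH SEQUENCE, ROWID (customer_id, amount)
  INCLUDING NEW VALUES;

-- The MV stores the query result; Oracle keeps it in sync on commit
CREATE MATERIALIZED VIEW sales_by_customer_mv
  REFRESH FAST ON COMMIT
AS
  SELECT customer_id,
         SUM(amount)   AS total_amount,
         COUNT(amount) AS cnt_amount,  -- needed for fast refresh of SUM
         COUNT(*)      AS cnt
  FROM   sales
  GROUP  BY customer_id;

If the query is too complex for fast refresh, a scheduled complete refresh (e.g. REFRESH COMPLETE START WITH SYSDATE NEXT SYSDATE + 1/24) is the fallback.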


Comments

Aravind R, August 14, 2017 - 10:51 am UTC

Hello Chris,

Maybe I didn't convey my question properly.
For the 1st question:
It can be any arbitrary query fetching data from the DB. My aim is to find the most efficient way to provide the data to the calling application in as little time as possible. I had also thought about the repercussions of multiple calls to the procedure with different queries, but I haven't found a solution for that yet. Any suggestions are welcome.

For the 2nd question:
Before I actually run the query to fetch the data, I want to analyze the burden of executing the query on the CPU, i.e., I want to know what the CPU cost, execution time, etc. would be.
There will be multiple tables with huge amounts of data in them. I don't want to run any query which takes more than a specified amount of time or which consumes a lot of CPU processing power. If such a query is passed to the stored procedure, I have to send a message to the calling application saying that the query cannot be run.

Also, I don't want to store any statistics data in any table.

Hope this conveys my problem more clearly.

Thanks
Regards
Aravind
Chris Saxon
August 14, 2017 - 2:21 pm UTC

The answer's still the same: just run the query! Any kind of staging you do is more work, therefore must take at least as long as just doing it in the first place. Possibly a lot longer.

On the second point, you still need to run the query to get this information. There is no general way to determine these figures other than running the query. And of course, they're likely to change each time, depending on changes to the data, what else the server is doing, etc.

If you're trying to stop runaway queries, or need to manage system resources carefully, use the Database Resource Manager:

http://docs.oracle.com/database/122/ADMIN/managing-resources-with-oracle-database-resource-manager.htm#ADMIN027

This can automatically kill queries that take too long, use too much CPU, IO, etc.
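As an illustration, here's a minimal sketch of a resource plan that cancels any call running longer than 60 seconds (the plan name and the 60-second threshold are hypothetical; run this as a suitably privileged user):

BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

  DBMS_RESOURCE_MANAGER.CREATE_PLAN(
    plan    => 'LIMIT_LONG_QUERIES',
    comment => 'Cancel SQL that runs too long');

  -- Every plan must include a directive for OTHER_GROUPS
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan             => 'LIMIT_LONG_QUERIES',
    group_or_subplan => 'OTHER_GROUPS',
    comment          => 'Cancel calls exceeding 60 seconds',
    switch_group     => 'CANCEL_SQL',  -- built-in group: cancels the current call
    switch_time      => 60,
    switch_estimate  => TRUE);         -- switch up front if the optimizer estimate already exceeds 60s

  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/

-- Activate the plan
ALTER SYSTEM SET resource_manager_plan = 'LIMIT_LONG_QUERIES';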

To Aravind: another possibility

J. Laurindo Chiappa, August 15, 2017 - 5:14 pm UTC

Hello, Aravind: regarding Connor's answer, I must second that - the only 100% guaranteed method to get precise measures is to run the query, no doubt... What I can point you to is the possibility of getting an estimate of the plan via EXPLAIN PLAN - this works even with dynamic SQL, see:

btest@JLCent:JLCent-DESENV:SQL>DECLARE
2 V_SQL varchar2(500) := 'SELECT * FROM TAB_LOG_OF_SYSTEM WHERE 1>2';
3 V_STATEMENT_ID varchar2(25) := 'TEST#2';
4 BEGIN
5 EXECUTE IMMEDIATE 'EXPLAIN PLAN SET STATEMENT_ID=' || CHR(39) || V_STATEMENT_ID || CHR(39) || ' FOR '|| v_SQL ||'';
6 END;
7 /

PL/SQL procedure successfully completed.

btest@JLCent:JLCent-DESENV:SQL>SELECT * FROM TABLE(dbms_xplan.display('PLAN_TABLE', 'TEST#2', 'ALL'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1470043750

---------------------------------------------------------------------------------------------
| Id  | Operation           | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                   |     1 |    93 |     0   (0)|          |
|*  1 |  FILTER             |                   |       |       |            |          |
|   2 |   TABLE ACCESS FULL | TAB_LOG_OF_SYSTEM |   17M | 1568M | 53299   (1)| 00:10:40 |
---------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

1 - SEL$1
2 - SEL$1 / TAB_LOG_OF_SYSTEM@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(NULL IS NOT NULL)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

1 - "TAB_LOG_OF_SYSTEM"."TABLE_NAME"[VARCHAR2,30],
"TAB_LOG_OF_SYSTEM"."USUARIO"[VARCHAR2,30],
"TAB_LOG_OF_SYSTEM"."DATAHORA"[DATE,7], "TAB_LOG_OF_SYSTEM"."ID"[NUMBER,22],
"TAB_LOG_OF_SYSTEM"."LOG"[VARCHAR2,1000]
2 - "TAB_LOG_OF_SYSTEM"."TABLE_NAME"[VARCHAR2,30],
"TAB_LOG_OF_SYSTEM"."USUARIO"[VARCHAR2,30],
"TAB_LOG_OF_SYSTEM"."DATAHORA"[DATE,7], "TAB_LOG_OF_SYSTEM"."ID"[NUMBER,22],
"TAB_LOG_OF_SYSTEM"."LOG"[VARCHAR2,1000]

32 rows selected.

btest@JLCent:JLCent-DESENV:SQL>

You could read the PLAN_TABLE (or even the output of DBMS_XPLAN) in your procedure to find the estimated number of rows, the estimated time and things like that, and act as required... But BE WARNED: EXPLAIN PLAN estimates are just a (more or less) educated guess - they can be wrong by a LARGE margin....
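As an illustration, here is a minimal sketch that reads the estimates back from PLAN_TABLE and rejects the statement above a threshold (the STATEMENT_ID matches the session above; the 600-second limit is hypothetical):

DECLARE
  v_cost NUMBER;
  v_secs NUMBER;
BEGIN
  -- Take the worst estimate across all plan steps: the root row (id = 0)
  -- can report zero when the optimizer short-circuits, as in the 1>2 example
  SELECT MAX(cost), MAX(time)
  INTO   v_cost, v_secs
  FROM   plan_table
  WHERE  statement_id = 'TEST#2';

  IF v_secs > 600 THEN
    RAISE_APPLICATION_ERROR(-20001,
      'Query rejected: estimated run time of ' || v_secs || 's exceeds the limit');
  END IF;
END;
/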

Due to this I despise on-the-fly queries very much (in my world, if the data volume is large, the SQL code ** MUST ** be constructed very carefully; you WANT and NEED a code review, etc), so I'm afraid that your idea will be doomed as soon as it hits Production, but the decision is yours...

In the end, if you WANT to give users the power of running any SQL (urgh!), your options to reclaim some control would be:

a. let the queries run BUT impose some maximum resource usage limits, just like Connor said

or

b. estimate the cost of the code received (via EXPLAIN PLAN, or via automated code reviews - i.e., some IFs checking whether the key columns supplied suffice, etc) and try to reject the execution of queries considered to be of 'bad quality'... OF COURSE, false positives and misses are ALWAYS a strong possibility...

Regards,

J. Laurindo Chiappa
