Hi Team
DB Version: 12.2.0.0.3
On a high level, on successful completion of the scripts (from LiveSQL) you would have:
1) Loaded 10k users data in JSON format in STAGE_USER table.
2) Loaded 500 roles data in JSON format in STAGE_ROLE table.
1) Loaded 10k user-role membership data in JSON format in STAGE_USERMEM table. Each user is assgined 100 roles.
[Note] ==> You might receive "ORA-06502: PL/SQL: numeric or value error" during Statement#9,19 and 29 executions, which converts the relational data to JSON format. Please ignore the error, the data will be loaded into the respective tables.
Gathered stats on all the three staging tables created post successful execution of the script.
EXEC dbms_stats.gather_table_stats(USER, '<TABLE_NAME>');
Below are the two use cases for which I see delay in response time with the JSON_TABLE operator being used:
1) Find total # of users assigned role 'DummyRole1' in the membership table
Avg response time of below SQL: ~3.9 secs.
With data increased to 100k in this membership table, the avg response time increases to ~20 secs.
SELECT COUNT(1)
FROM STAGE_USERMEM o,
json_table(userrolemem_json, '$'
COLUMNS
(
NESTED PATH '$.user_mem_dtls.user_role_mem[*]'
COLUMNS (role_id VARCHAR2(32 CHAR) PATH '$.role_id',
role_name VARCHAR2(20 CHAR) PATH '$.role_name')
)
) t
WHERE t.role_name='DummyRole1';
2) Find total no of roles assigned to each of the users in the membership table
Avg response time of below SQL: ~4.5 secs.
With data increased to 100k in this membership table, the avg response time increases to ~30 secs.
SELECT t.user_id,COUNT(1)
FROM STAGE_USERMEM o,
json_table(userrolemem_json, '$'
COLUMNS
(
user_id VARCHAR2(32 CHAR) PATH '$.user_id',
NESTED PATH '$.user_mem_dtls.user_role_mem[*]'
COLUMNS (role_id VARCHAR2(32 CHAR) PATH '$.role_id',
role_name VARCHAR2(20 CHAR) PATH '$.role_name')
)
) t
GROUP BY t.user_id;
Could you suggest how to speed up the response time for the above two use cases, as when load increases the response time shoots up.
Thanks in advance!
Are the sample data representative of your real data?
DummyRole1 exists in every JSON document. So you're returning every row from the table.
If this is just a "for example", and in your production data each role only appears in a "few" documents, you may get some benefits from creating a JSON search index.
To do this, you first need to add the "is json" check constraint to the column. Then add the index:
alter table stage_usermem add constraint j check ( userrolemem_json is json );
create search index stage_usermem_ji on stage_usermem (userrolemem_json) for json;
But if roles appear in most/all the documents, adding this will give little or no benefit:
set serveroutput off
SELECT /*+ gather_plan_statistics */COUNT(1)
FROM STAGE_USERMEM o,
json_table(userrolemem_json, '$'
COLUMNS
(
NESTED PATH '$.user_mem_dtls.user_role_mem[*]'
COLUMNS (role_id VARCHAR2(32 CHAR) PATH '$.role_id',
role_name VARCHAR2(20 CHAR) PATH '$.role_name')
)
) t
WHERE t.role_name='DummyRole1';
select * from table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));
Plan hash value: 2625201623
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.97 | 20134 | 20000 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.97 | 20134 | 20000 |
| 2 | NESTED LOOPS | | 1 | 10000 | 10000 |00:00:02.01 | 20134 | 20000 |
| 3 | TABLE ACCESS FULL | STAGE_USERMEM | 1 | 10000 | 10000 |00:00:00.02 | 134 | 0 |
| 4 | JSONTABLE EVALUATION | | 10000 | | 10000 |00:00:01.93 | 20000 | 20000 |
------------------------------------------------------------------------------------------------------------
Note
-----
- statistics feedback used for this statement
alter table stage_usermem add constraint j check ( userrolemem_json is json );
create search index stage_usermem_ji on stage_usermem (userrolemem_json) for json;
SELECT /*+ gather_plan_statistics */COUNT(1)
FROM STAGE_USERMEM o,
json_table(userrolemem_json, '$'
COLUMNS
(
NESTED PATH '$.user_mem_dtls.user_role_mem[*]'
COLUMNS (role_id VARCHAR2(32 CHAR) PATH '$.role_id',
role_name VARCHAR2(20 CHAR) PATH '$.role_name')
)
) t
WHERE t.role_name='DummyRole1';
select * from table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));
Plan hash value: 3790423225
---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:02.07 | 21148 | 20877 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:02.07 | 21148 | 20877 |
| 2 | NESTED LOOPS | | 1 | 408 | 10000 |00:00:01.96 | 21148 | 20877 |
| 3 | TABLE ACCESS BY INDEX ROWID| STAGE_USERMEM | 1 | 5 | 10000 |00:00:00.06 | 1147 | 877 |
|* 4 | DOMAIN INDEX | STAGE_USERMEM_JI | 1 | | 10000 |00:00:00.03 | 1020 | 877 |
| 5 | JSONTABLE EVALUATION | | 10000 | | 10000 |00:00:01.84 | 20001 | 20000 |
---------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("CTXSYS"."CONTAINS"("O"."USERROLEMEM_JSON",'{DummyRole1} INPATH
(/user_mem_dtls/user_role_mem/role_name)')>0)
Notice that the value in the buffers column increases when using the index. So this makes the query do more work!
So if roles are common, you may be better suited to creating a materialized view over your query. Then querying that. For example:
create materialized view json_mv
as
SELECT t.role_name, COUNT(1)
FROM STAGE_USERMEM o,
json_table(userrolemem_json, '$'
COLUMNS
(
NESTED PATH '$.user_mem_dtls.user_role_mem[*]'
COLUMNS (role_id VARCHAR2(32 CHAR) PATH '$.role_id',
role_name VARCHAR2(20 CHAR) PATH '$.role_name')
)
) t
group by t.role_name;
select /*+ gather_plan_statistics */* from json_mv
where role_name = 'DummyRole1';
select * from table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));
Plan hash value: 445572365
------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 9 |
|* 1 | MAT_VIEW ACCESS FULL| JSON_MV | 1 | 1 | 1 |00:00:00.01 | 9 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ROLE_NAME"='DummyRole1')
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Notice the buffers (work) has dropped drastically. You should see much quicker execution times too.
And of course, it's worth asking:
Do you
need to store the data as JSON? Can you use regular relational tables instead?
This makes indexing more straightforward.
And, if you look at the original plans closely, you'll see that most of the time goes on the "JSONTABLE EVALUATION" operation. i.e. converting your data from JSON to relational is where most of the time goes. Get rid of this and you could make your queries much faster.