Skip to Main Content
  • Questions
  • Using SYS_GUID() has primary key in large table

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Kessy.

Asked: February 08, 2018 - 12:35 am UTC

Last updated: February 13, 2018 - 12:47 am UTC

Version: 11g

Viewed 1000+ times

You Asked

Hey,

I have a large table 600 million rows, approximately 70gb, not partitioned (yet), a few indices, etc.

I have reports that query this table often but they're very slow (20-60 minutes at times). I want to create this materialized view but I can't because my base table doesn't have a primary key

This is what I want to do:

CREATE MATERIALIZED VIEW LOG ON LLATTRDATA;

CREATE MATERIALIZED VIEW MV_LLATTRDATA_TEST1
    NOLOGGING
    CACHE
    BUILD IMMEDIATE 
    REFRESH FAST ON COMMIT 
    ENABLE QUERY REWRITE
    AS 
        SELECT *
        FROM LLATTRDATA D
        WHERE 
            (D.DEFID = 3070056 AND D.ATTRID IN (2, 3, 4)) OR
            (D.DEFID = 3070055 AND D.ATTRID IN (3, 30, 31, 2, 24, 23, 4)) OR
            (D.DEFID = 3071379 AND D.ATTRID IN (3, 5, 8)) OR
            (D.DEFID = 3072256 AND D.ATTRID = 5);
            
BEGIN
  DBMS_STATS.gather_table_stats(
    ownname => 'CHEACOCK',
    tabname => 'MV_LLATTRDATA_TEST1');

END;




There's an ID field but it's not unique and no other column would server well as a Primary Key.

So I was thinking of using SYS_GUID on a new column and declaring that the Primary Key:

ALTER TABLE LLATTRDATA ADD PKID RAW(16);
ALTER TABLE LLATTRDATA ADD CONSTRAINT PK_LLATTRDATA PRIMARY KEY(PKID);
UPDATE LLATTRDATA SET PKID = SYS_GUID()


Would this make sense or is there some considerations I'm not taking into account?

and Connor said...

I'd be perhaps more tempted to move toward a partitioning model before heading down the mview path. I don't have enough information to make a recommendation, but *for example*, hash partitioning on DEFID might give sufficient data distribution and granularity without adding columns or mviews.


As as aside, you don't want to be updating 600m rows. It would take a long time, and you'd probably need to perform a reorganization of the table afterwards anyway because every row will grow in size - you might get a big row migration performance issue.

You would probably use DBMS_REDEFINITION to do the work, which minimizes any outage, eg

SQL> create table t as select * from dba_objects where object_id is not null;

Table created.

SQL> exec dbms_redefinition.can_redef_table(user, 'T', dbms_redefinition.cons_use_rowid);

PL/SQL procedure successfully completed.

SQL>
SQL> create table t2 as
  2  select *
  3  from   t where 1=2;

Table created.

SQL>
SQL> alter table t2 add pk raw(16) default sys_guid();

Table altered.

SQL>
SQL> exec dbms_redefinition.start_redef_table(user, 'T', 'T2',options_flag=>dbms_redefinition.cons_use_rowid);

PL/SQL procedure successfully completed.

SQL>
SQL> alter table t2 add (constraint t2_pk primary key (pk));

Table altered.

SQL>
SQL> exec dbms_redefinition.finish_redef_table(user, 'T', 'T2');

PL/SQL procedure successfully completed.



You can also use the same technique to partition the table without downtime.

Rating

  (4 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Workarround if

Gh, February 08, 2018 - 9:06 am UTC

Is it an oltp or dwh?
Is it intensive updated table?

You could explore the virtual colunm. And index or bitmap index it.

E.g. alter .. add mycol as .. use the Case statement for your conditions : (D.DEFID = 3070056 AND D.ATTRID IN (2, 3, 4)) OR (D.DEFID = 3070055 AND D.ATTRID IN (3, 30, 31, 2, 24, 23, 4)) OR (D.DEFID = 3071379 AND D.ATTRID IN (3, 5, 8)) OR (D.DEFID = 3072256 AND D.ATTRID = 5);


And attribute say 1, 2, 3, or 4 else 0 for its value.
The as said index it .. tune.
Maybe it would resolve your issue.

Connor McDonald
February 09, 2018 - 2:19 am UTC

Good input.

using sequence and guids

Rajeshwaran, Jeyabal, February 09, 2018 - 12:45 pm UTC

Any specific reason to go with GUID as primary keys?

Sequence based primary keys look smart than guid based keys.

few details @ https://tinyurl.com/ydhlxl2l
Connor McDonald
February 09, 2018 - 1:29 pm UTC

I wouldn't go with either (for this case).

Although it boggles the mind there is not are existing column(s) that is not a natural key.

Muhammad Riaz, February 11, 2018 - 5:42 am UTC

From table name, it is clear that it is Opentext content management server environment. So prior to applying any change, it should be discussed with OT support to have their recommendations.

Anyways, perhaps it is the result set (number of records from query) that takes time. If your result set is good percentage of total number of records, adding MV may not be so much beneficial.

Also try to force query to use index and compare the time.

We have 150 million records in our environment but the result set is not very small percentage of total records.


SELECT count(*)
         FROM livelinkprod.LLATTRDATA D
         WHERE (defid=2327 and attrid in (8,4,5)) OR
  (defid=86124 and attrid in (8,7))

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     37.58      89.80    1174657    1174682          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     37.58      89.80    1174657    1174682          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 84  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=1174682 pr=1174657 pw=0 time=89802775 us)
    184465     184465     184465   TABLE ACCESS FULL LLATTRDATA (cr=1174682 pr=1174657 pw=0 time=1868695 us cost=305077 size=72443392 card=9055424)

********************************************************************************

SELECT /*+Rule */ count(*)
          FROM livelinkprod.LLATTRDATA D
          WHERE (defid=2327 and attrid in (8,4,5)) OR
   (defid=86124 and attrid in (8,7))

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.45       0.45          0       8000          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.45       0.45          0       8000          0           1

Misses in library cache during parse: 1
Optimizer mode: RULE
Parsing user id: 84  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=8000 pr=0 pw=0 time=454701 us)
    184465     184465     184465   CONCATENATION  (cr=8000 pr=0 pw=0 time=521746 us)
     48658      48658      48658    TABLE ACCESS BY INDEX ROWID LLATTRDATA (cr=3833 pr=0 pw=0 time=152688 us)
    218961     218961     218961     INDEX RANGE SCAN LLATTRDATA_DEFID (cr=584 pr=0 pw=0 time=105118 us)(object id 259463)
    135807     135807     135807    TABLE ACCESS BY INDEX ROWID LLATTRDATA (cr=4167 pr=0 pw=0 time=332064 us)
    452690     452690     452690     INDEX RANGE SCAN LLATTRDATA_DEFID (cr=1138 pr=0 pw=0 time=220036 us)(object id 259463)

Connor McDonald
February 13, 2018 - 12:47 am UTC

Great to have input from someone who actually has the structures in play.

Thanks very much.

Muhammad Riaz, February 11, 2018 - 5:45 am UTC

From table name, it is clear that it is Opentext content management server environment. So prior to applying any change, it should be discussed with OT support to have their recommendations.

Anyways, perhaps it is the result set (number of records from query) that takes time. If your result set is good percentage of total number of records, adding MV may not be so much beneficial.

Also try to force query to use index and compare the time.

We have 150 million records in our environment but the result set is not very small percentage of total records.


SELECT count(*)
         FROM livelinkprod.LLATTRDATA D
         WHERE (defid=2327 and attrid in (8,4,5)) OR
  (defid=86124 and attrid in (8,7))

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     37.58      89.80    1174657    1174682          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     37.58      89.80    1174657    1174682          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 84  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=1174682 pr=1174657 pw=0 time=89802775 us)
    184465     184465     184465   TABLE ACCESS FULL LLATTRDATA (cr=1174682 pr=1174657 pw=0 time=1868695 us cost=305077 size=72443392 card=9055424)

********************************************************************************

SELECT /*+Rule */ count(*)
          FROM livelinkprod.LLATTRDATA D
          WHERE (defid=2327 and attrid in (8,4,5)) OR
   (defid=86124 and attrid in (8,7))

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.45       0.45          0       8000          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.45       0.45          0       8000          0           1

Misses in library cache during parse: 1
Optimizer mode: RULE
Parsing user id: 84  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=8000 pr=0 pw=0 time=454701 us)
    184465     184465     184465   CONCATENATION  (cr=8000 pr=0 pw=0 time=521746 us)
     48658      48658      48658    TABLE ACCESS BY INDEX ROWID LLATTRDATA (cr=3833 pr=0 pw=0 time=152688 us)
    218961     218961     218961     INDEX RANGE SCAN LLATTRDATA_DEFID (cr=584 pr=0 pw=0 time=105118 us)(object id 259463)
    135807     135807     135807    TABLE ACCESS BY INDEX ROWID LLATTRDATA (cr=4167 pr=0 pw=0 time=332064 us)
    452690     452690     452690     INDEX RANGE SCAN LLATTRDATA_DEFID (cr=1138 pr=0 pw=0 time=220036 us)(object id 259463)

More to Explore

Administration

Need more information on Administration? Check out the Administrators guide for the Oracle Database