Skip to Main Content

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Thippeswamy.

Asked: September 06, 2017 - 8:07 am UTC

Last updated: August 25, 2020 - 2:46 pm UTC

Version: 11.2.0.4.0

Viewed 10K+ times! This question is

You Asked

I have question on clustered index

I read from documents that whenever primary key is created, it creates clustered index along with it, and it sorts the rows in the table in the same order as the clustered index(on the actual disk), I didn’t understand the point sorts the rows in the table, does it mean the records in the file are sorted, if yes, physical address of the record should change, but that is not the case, I checked physical address for record remains same even after creating primary key constraint on the column.

Please help me to understand the concept.

and Connor said...

We dont sort the table - the table stays as is. In brief:

A table is (typically) a heap structure, ie, you add rows and they are scattered throughout storage. There is no particular or predefined order to the data. We stuff it wherever we like as fast as we like. Our focus is on performance and concurrency.

But obviously, if you want to have some structure/rules about that data, then primary keys and the like come into play. To check whether we are about to log a duplicate key is difficult in a heap structure - you would have to (a) lock the table, (b) scan the entire structure, because that duplicate could be *anywhere* in the structure. That's not scalable at all.

So we create an additional structure (an index) to sit alongside the table. (The table data is still as before, scattered). The index is a well-ordered structure so that when looking for a duplicate, I can quickly find the spot in the index where a duplicate entry *was* going to exist if it was present. That's much faster to check. But that comes with trade offs. I have to *maintain* that well-ordered structure every time I add data.

We use a B-tree structure to look after that. An excellent description of this can be found in our Concepts manual

https://docs.oracle.com/database/121/CNCPT/indexiot.htm#CNCPT1170

Rating

  (3 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Clustered index

Thippeswamy CR, September 07, 2017 - 5:03 am UTC

Thanks Tom For the replay, Indeed am Big fan this site.

I understand the concept of cluster index in oracle, but is true that in oracle that when primary key is created, it would create clustered index along with it..?

Clustered index created on the cluster, when primary key is not part of cluster(so no cluster key), what is point of having clustered index..?

I agree Clustered index is useful when dept id is shared among dept and employee and we have primary key of Dept table is being established as FK in Employee table.


Thippeswamy CR, September 07, 2017 - 5:43 am UTC

Thanks Tom For the replay, Indeed am Big fan this site.

I understand the concept of cluster index in oracle, but is true that in oracle that when primary key is created, it would create clustered index along with it(can you privide the link of authorised doc)..?

Clustered index created on the cluster, when primary key is not part of cluster(so no cluster key), what is point of having clustered index..?

I agree Clustered index is useful when dept id is shared among dept and employee and we have primary key of Dept table is being established as FK in Employee table.

Connor McDonald
September 14, 2017 - 3:41 am UTC

Sorry, I misread your initial question and just told you about normal tables and their indexes. For clusters, you must have a clustered index before you are allowed to "use them".

From the docs:

https://docs.oracle.com/cd/E11882_01/server.112/e40540/tablecls.htm#CNCPT88825

"An indexed cluster is a table cluster that uses an index to locate data. The cluster index is a B-tree index on the cluster key. A cluster index must be created before any rows can be inserted into clustered tables."

eg

SQL> CREATE CLUSTER employees_departments_cluster
  2     (department_id NUMBER(4))
  3  SIZE 512;

Cluster created.

SQL> CREATE TABLE employees ( department_id number(4), ename varchar2(10), empno int )
  2     CLUSTER employees_departments_cluster (department_id);

Table created.

SQL>
SQL> CREATE TABLE departments ( department_id number(4), dname varchar2(10), sales int )
  2     CLUSTER employees_departments_cluster (department_id);

Table created.


All looks until I try to add data

SQL> insert into departments values (1,'d1',1000);
insert into departments values (1,'d1',1000)
            *
ERROR at line 1:
ORA-02032: clustered tables cannot be used before the cluster index is built


So I need that index...

SQL> CREATE INDEX idx_emp_dept_cluster ON CLUSTER employees_departments_cluster;

Index created.

SQL> insert into departments values (1,'d1',1000);

1 row created.

SQL> insert into employees values (1,'emp123',123);

1 row created.

SQL> insert into employees values (1,'emp124',124);

1 row created.

SQL> insert into employees values (1,'emp125',125);

1 row created.

SQL>


Cluster Index

David Kurtz, August 25, 2020 - 11:45 am UTC

The point of clustering a table that rows with the same cluster key values will be inserted into the same data block (though they will go to a different block if no space available). So when you frequently query many rows with the same cluster key value, even from different tables, you only have to read a few data blocks, and there is a better chance of them being in the buffer cache. You can have multiple tables in the same cluster, and it is the cluster key index is used to identify the data block where those rows are located.

set autotrace off
drop cluster dept_cluster;
drop cluster dept;
drop table emp purge;
drop table dept purge;
create cluster dept_cluster(department_id number(2));
create index dept_cluster_idx ON cluster dept_cluster;


CREATE TABLE DEPT
(DEPARTMENT_ID NUMBER(2,0),
 DEPARTMENT_NAME VARCHAR2(14),
 LOCATION VARCHAR2(13),
 CONSTRAINT dept_PK PRIMARY KEY (DEPARTMENT_ID)
)
cluster dept_cluster (department_id);

CREATE TABLE emp
(EMPLOYEE_ID NUMBER(4,0),
 EMPLOYEE_NAME VARCHAR2(10),
 JOB VARCHAR2(9),
 MANAGER_ID NUMBER(4,0),
 HIREDATE DATE,
 SALARY NUMBER(7,2),
 COMMISSION NUMBER(7,2),
 DEPARTMENT_ID NUMBER(2,0),
 CONSTRAINT "EMP_PK" PRIMARY KEY ("EMPLOYEE_ID"),
 CONSTRAINT "EMP_DEPT_FK" FOREIGN KEY ("DEPARTMENT_ID") REFERENCES dept ("DEPARTMENT_ID") ENABLE
)
cluster dept_cluster (department_id);

create index emp_dept_fk_idx on emp (department_id);

insert into dept select * from departments;
insert into emp select * from employees;
commit;
exec dbms_stats.gather_table_stats(user,'EMP');
exec dbms_stats.gather_table_stats(user,'DEPT');

break on department_id skip 1
select department_id, department_name, rowid, dbms_rowid.rowid_block_number(rowid) block_number
from dept
order by block_number;
select employee_id, employee_name, department_id, rowid, dbms_rowid.rowid_block_number(rowid) block_number
from emp
order by block_number;


Note that rows from EMP and DEPT with the same cluster key (department ID) are in the same blocks

DEPARTMENT_ID DEPARTMENT_NAM ROWID              BLOCK_NUMBER
------------- -------------- ------------------ ------------
           10 ACCOUNTING     AAAS5BAAMAAAeZ8AAA       124540

           20 RESEARCH       AAAS5BAAMAAAeZ9AAA       124541

           30 SALES          AAAS5BAAMAAAeZ+AAA       124542

           40 OPERATIONS     AAAS5BAAMAAAeZ/AAB       124543


SQL>   2    3
EMPLOYEE_ID EMPLOYEE_N DEPARTMENT_ID ROWID              BLOCK_NUMBER
----------- ---------- ------------- ------------------ ------------
       7934 MILLER                10 AAAS5BAAMAAAeZ8AAC       124540
       7839 KING                     AAAS5BAAMAAAeZ8AAB       124540
       7782 CLARK                    AAAS5BAAMAAAeZ8AAA       124540

       7788 SCOTT                 20 AAAS5BAAMAAAeZ9AAC       124541
       7566 JONES                    AAAS5BAAMAAAeZ9AAB       124541
       7369 SMITH                    AAAS5BAAMAAAeZ9AAA       124541
       7876 ADAMS                    AAAS5BAAMAAAeZ9AAD       124541
       7902 FORD                     AAAS5BAAMAAAeZ9AAE       124541

       7499 ALLEN                 30 AAAS5BAAMAAAeZ+AAA       124542

EMPLOYEE_ID EMPLOYEE_N DEPARTMENT_ID ROWID              BLOCK_NUMBER
----------- ---------- ------------- ------------------ ------------
       7521 WARD                  30 AAAS5BAAMAAAeZ+AAB       124542
       7654 MARTIN                   AAAS5BAAMAAAeZ+AAC       124542
       7698 BLAKE                    AAAS5BAAMAAAeZ+AAD       124542
       7844 TURNER                   AAAS5BAAMAAAeZ+AAE       124542
       7900 JAMES                    AAAS5BAAMAAAeZ+AAF       124542


Note that Oracle used the cluster index not the FK index
set autotrace on pages 99 lines 120
select * from emp
where department_id = 10;

EMPLOYEE_ID EMPLOYEE_N JOB       MANAGER_ID HIREDATE      SALARY COMMISSION DEPARTMENT_ID
----------- ---------- --------- ---------- --------- ---------- ---------- -------------
       7782 CLARK      MANAGER         7839 09-JUN-81       2450                       10
       7839 KING       PRESIDENT            17-NOV-81       5000
       7934 MILLER     CLERK           7782 23-JAN-82       1300



Execution Plan
----------------------------------------------------------
Plan hash value: 2254702070

-----------------------------------------------------------------------------------------
| Id  | Operation            | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                  |     5 |   190 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS CLUSTER| EMP              |     5 |   190 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN  | DEPT_CLUSTER_IDX |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DEPARTMENT_ID"=10)



Chris Saxon
August 25, 2020 - 2:46 pm UTC

Good explanation

More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.

VLDB

Documentation set on VLDB and Partitioning.