Skip to Main Content
  • Questions
  • Natural Join and Index Monitoring (9iR2 Slide Show)

Breadcrumb

May 4th

Question and Answer

Tom Kyte

Thanks for the question, Jerry.

Asked: March 21, 2003 - 7:58 pm UTC

Last updated: May 23, 2013 - 3:02 pm UTC

Version: 9.2

Viewed 1000+ times

You Asked

I was able to go through the slide show on 9iR2 features today, and came away with 2 questions:

(1) I am not clear on why natural joins are a bad idea. I guess I can imagine some scenarios, but I would appreciate some details on the downsides of them.

(2) If there is one thing I've learned from this site (and there isn't only one), that's to try things out. I observe different behavior from the slide that says that index monitoring data is wiped out after a restart. It appears to stick around until the index is dropped, and the doc indicates that, too.

Grateful as always.

P.S. Never been disappointed with an Addison-Wesley book. They seem like a class outfit.

and Tom said...

1) they join by names of columns.

It is funny, there is a bug logged in the bug database against natural joins. Says "they get the wrong answer". sure enough, it would appear they do -- however, the bug is closed as "not a bug"

hr@ORA920> select employee_id, department_name
2 from employees natural join departments;

EMPLOYEE_ID DEPARTMENT_NAME
----------- ------------------------------
202 Marketing
115 Purchasing
...
113 Finance
206 Accounting

32 rows selected.

hr@ORA920> select employee_id, department_name
2 from employees, departments
3 where employees.department_id = departments.department_id;

EMPLOYEE_ID DEPARTMENT_NAME
----------- ------------------------------
200 Administration
202 Marketing
...
206 Accounting
205 Accounting

106 rows selected.

See -- obviously the natural join is broken right? (this is using the HR sample schema provided with Oracle9i). There are 32 rows using a natural join, yet 106 using a hand coded natural join. Whats up with that?

Well:

hr@ORA920> desc employees;
Name Null? Type
----------------------------------- -------- ------------------------
EMPLOYEE_ID NOT NULL NUMBER(6)
FIRST_NAME VARCHAR2(20)
LAST_NAME NOT NULL VARCHAR2(25)
EMAIL NOT NULL VARCHAR2(25)
PHONE_NUMBER VARCHAR2(20)
HIRE_DATE NOT NULL DATE
JOB_ID NOT NULL VARCHAR2(10)
SALARY NUMBER(8,2)
COMMISSION_PCT NUMBER(2,2)
MANAGER_ID NUMBER(6)

DEPARTMENT_ID NUMBER(4)

hr@ORA920> desc departments
Name Null? Type
----------------------------------- -------- ------------------------
DEPARTMENT_ID NOT NULL NUMBER(4)
DEPARTMENT_NAME NOT NULL VARCHAR2(30)
MANAGER_ID NUMBER(6)

LOCATION_ID NUMBER(4)


The problem is there is a manager id column in there as well. The relationship is really:

emp joins to dept by department_id to department_id
dept joins to emp by manager_id to employee_id

But a natural join just sees two columns with the same name.

It is a bug just WAITING to happen. Someone adds a column to a table a year from now and all of a sudden -- YOUR JOIN CONDITIONS CHANGE -- doh! Or worse, you didn't catch the 32 vs 106 condition and you have a bug from day 1.


Ahh, i see what I did wrong there:


ops$tkyte@ORA9I> select * from v$object_usage;

INDEX_NAME TABLE_NAME MON USE START_MONITORING END_MONITORING
------------------------------ ------------------------------ --- --- ------------------- -------------------
T_IDX T YES YES 03/22/2003 10:04:48

ops$tkyte@ORA9I> startup force
ORA-01031: insufficient privileges
ops$tkyte@ORA9I> connect / as sysdba
Connected.
ops$tkyte@ORA9I> startup force
ORACLE instance started.

Total System Global Area 174359528 bytes
Fixed Size 279528 bytes
Variable Size 134217728 bytes
Database Buffers 33554432 bytes
Redo Buffers 6307840 bytes
Database mounted.
Database opened.
ops$tkyte@ORA9I> select * from v$object_usage;

no rows selected

appears wiped out... but it was my mistake


ops$tkyte@ORA9I> connect /
Connected.
ops$tkyte@ORA9I> select * from v$object_usage;

INDEX_NAME TABLE_NAME MON USE START_MONITORING END_MONITORING
------------------------------ ------------------------------ --- --- ------------------- -------------------
T_IDX T YES YES 03/22/2003 10:04:48


I tagged those slides to correct that.

Rating

  (8 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

So, why was the syntax introduced?

Arup Nanda, March 23, 2003 - 10:35 pm UTC

I had similar apprehensions about using the new syntax of Natural Joins. It's just too easy to introduce a bug and certainly makes the current queries ambiguous. So, why did ANSI introduce them as a standard, and why did Oracle followed suit? They must have had some reasoning. Appreciate if you could share your comments.

Tom Kyte
March 24, 2003 - 7:37 am UTC

Answering in reverse

why did Oracle follow suit? Because industry demands it. We cannot pick and choose syntactical features. If we support the ANSI join syntax we either support it or we don't. If we leave out some -- we don't.

As to why ANSI did -- don't know, you would have to ask them. I see no advantage to it, only total disadvantage. I cannot fathom their thought process on this one.

Natural joins - bad ANSI standard

Jonathan Brain, March 24, 2003 - 4:25 am UTC

If natural joins looked for a foreign key constraint instead of matching column names and raised an exception if there was none then we might have something usefull.

How about inventing a KEY JOIN clause to use instead of NATURAL JOIN?


Natural Joins can produce cartesian products

Chip, April 07, 2003 - 3:13 am UTC

In Oracle 9i and 9i Release 2, a natural join becomes a cross join if the tables do not have any common column names.

For example, trying to list the countries for all departments using the HR example schema can produce a cartesian product (instead of an error):

hr@MSORA920>desc countries
Name Null? Type
---------------- -------- ------------
COUNTRY_ID NOT NULL CHAR(2)
COUNTRY_NAME VARCHAR2(40)
REGION_ID NUMBER

hr@MSORA920>desc departments
Name Null? Type
---------------- -------- ------------
DEPARTMENT_ID NOT NULL NUMBER(4)
DEPARTMENT_NAME NOT NULL VARCHAR2(30)
MANAGER_ID NUMBER(6)
LOCATION_ID NUMBER(4)

hr@MSORA920>SELECT country_name, department_name
2 FROM countries NATURAL JOIN departments;

675 rows selected. (25 countries * 27 departments).


Note: joining the locations table does produce the desired list of country names for all departments.

hr@MSORA920>SELECT country_name, department_name
2 FROM countries NATURAL JOIN locations NATURAL JOIN departments;

27 rows selected.


Natural Joins

Sikandar Hayat, April 28, 2003 - 4:01 am UTC

In Docs,

"The NATURAL keyword indicates that a natural join is being performed. A natural join is based on all columns in the two tables that have the same name. It selects rows from the two tables that have equal values in the relevant columns. When specifying columns that are involved in the natural join, do not qualify the column name with a table name or table alias."

My queries is that with Natural we can't join more than 2 tables?




Tom Kyte
April 28, 2003 - 7:26 am UTC


scott@ORA920> select *
2 from ( emp natural join dept ) natural join bonus
3 /

no rows selected


it works syntactically -- don't don't ever use a natural join in real life, forget they exist.

What's behind Natural Join

Bhavesh Patel, March 31, 2005 - 9:04 am UTC

Hi Tom,
I try following three queries at my own :

1)select ename,dname
from scott.emp,scott.dept
where emp.deptno=dept.deptno

Statistics
----------------------------------------------------------
300 recursive calls
0 db block gets
56 consistent gets
2 physical reads
0 redo size
1052 bytes sent via SQL*Net to client
430 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
15 rows processed

2) select ename,dname
from scott.emp natural join scott.dept;

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
1052 bytes sent via SQL*Net to client
430 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
15 rows processed


3)select ename,dname
from scott.emp inner join scott.dept
on emp.deptno=dept.deptno;

Statistics
----------------------------------------------------------
361 recursive calls
0 db block gets
73 consistent gets
3 physical reads
0 redo size
1052 bytes sent via SQL*Net to client
430 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
15 rows processed

I want to know that why natural join is so lightweighted??
what is the reason behind it??
Which things are happening whent we join naturally?

Please Reply.






Tom Kyte
March 31, 2005 - 9:32 am UTC

whats up with the massive recursive SQL?

do this, run the queries (and ignore the output)

then run them with autotrace (exactly the same queries) and show us that.

Here Is the output

Bhavesh Patel, March 31, 2005 - 11:44 pm UTC

SQL> analyze table scott.emp
  2  compute statistics;

Table analyzed.

SQL> set autotrace on
SQL> select ename,dname
  2  from scott.emp,scott.dept
  3  where emp.deptno=dept.deptno;

ENAME      DNAME
---------- --------------
harsh      ACCOUNTING
BHAVESH    ACCOUNTING
KING       ACCOUNTING
CLARK      ACCOUNTING
RAjeev     RESEARCH
ADAMS      RESEARCH
SCOTT      RESEARCH
JONES      RESEARCH
SMITH      RESEARCH
JAMEdsdS   SALES
TURNER     SALES

ENAME      DNAME
---------- --------------
BLAKE      SALES
MARTIN     SALES
WARD       SALES
ALLEN      SALES

15 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=410 Bytes=123
          00)

   1    0   HASH JOIN (Cost=5 Card=410 Bytes=12300)
   2    1     TABLE ACCESS (FULL) OF 'EMP' (Cost=2 Card=15 Bytes=120)
   3    1     TABLE ACCESS (FULL) OF 'DEPT' (Cost=2 Card=82 Bytes=1804
          )





Statistics
----------------------------------------------------------
        341  recursive calls
          0  db block gets
         71  consistent gets
          3  physical reads
          0  redo size
       1052  bytes sent via SQL*Net to client
        430  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          6  sorts (memory)
          0  sorts (disk)
         15  rows processed


2)SQL>  select ename,dname
  2   from scott.emp natural join scott.dept;

ENAME      DNAME
---------- --------------
harsh      ACCOUNTING
BHAVESH    ACCOUNTING
KING       ACCOUNTING
CLARK      ACCOUNTING
RAjeev     RESEARCH
ADAMS      RESEARCH
SCOTT      RESEARCH
JONES      RESEARCH
SMITH      RESEARCH
JAMEdsdS   SALES
TURNER     SALES

ENAME      DNAME
---------- --------------
BLAKE      SALES
MARTIN     SALES
WARD       SALES
ALLEN      SALES

15 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=410 Bytes=123
          00)

   1    0   HASH JOIN (Cost=5 Card=410 Bytes=12300)
   2    1     TABLE ACCESS (FULL) OF 'EMP' (Cost=2 Card=15 Bytes=120)
   3    1     TABLE ACCESS (FULL) OF 'DEPT' (Cost=2 Card=82 Bytes=1804
          )





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
       1052  bytes sent via SQL*Net to client
        430  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         15  rows processed

    Tom,I do not know much about Oracle structure..Right now I am learning SQL,
and By readin your article I try to trace out the things, I think here natural join doesn't
make that effect..but that must be some cache ..
    Pls. clarify the things And sorry If anythings wrong from my side..(I am sure that i done something wrong 
with trace But what that i dont know..) 

Tom Kyte
April 01, 2005 - 8:24 am UTC

lets see a tkprof here and make sure to run both queries (getting all recursive sql that need not be there out of the way) and then turn on trace and then run them again.

just put up the relevant portions of the tkprof, just the two sections for each query (along with the plans you see in the report).

so:

connect
analyze
select 1
select 2
alter session set sql_trace=true;
select 1
select 2
exit

tkprof tracefile outputfile

edit outputfile and post just the two sections.

All things Clear

Bhavesh Ghodasara, April 01, 2005 - 9:53 am UTC

Hi Thomas,
Now all the things are clear to me..
its just because that i execute first the query of natural join and after that inner join(oracle)..
So Its works sam..
Thanks For Your time..


The real bug is in your code!

Jamie Collins, May 23, 2013 - 10:09 am UTC

Your second query, which you claim is a "hand coded natural join", is not semantically equivalent to the first query, which uses the NATURAL JOIN syntax. Of course, you go on to point out your mistake (i.e. you omitted MANAGER_ID from the join condition) but fail to attribute this to your code, instead blaming the implementation (even though it seemingly complies with the SQL Standard).

The thrust of your argument seems to be that for the casual observer the two tables should be joined using DEPARTMENT_ID alone. Surely, then, your complaint should be directed to the designer of the model who chose to use the same element name for two different predicates, "Employee...reports to manager MANAGER_ID" and "Department... is headed by manager MANAGER_ID" respectively, and commits this design flaw on tables that are expected to be frequently joined in SQL DML.

Finally, consider that many aspects of SQL are non-intuitive, the prime example being nulls. Not only is three-valued logic (3VL) hard for users familiar with first-order logic to comprehend, 3VL is inconsistently implemented in SQL, one must learn all the exceptional cases (e.g. that SUM for a column of type INTEGER that is all nulls returns NULL but COUNT for the same column returns 0). An analogue of your answer would be a novice using the equals operator involving nulls e.g. using NULL = NULL and intuitively thinking it to evaluate TRUE. I'm sure you wouldn't claim this was a bug, rather attribute the mistake to the user's unfamiliarity with the language or the data model.
Tom Kyte
May 23, 2013 - 3:02 pm UTC

... Your second query, which you claim is a "hand coded natural join", is not
semantically equivalent to the first query, which uses the NATURAL JOIN syntax. ...

I know, that is the point I was trying to make???

I did not fail to attribute this to my code, I pointed out that the natural join syntax is stupid and error prone. It does, it can, it has caused many a mistake. there is a zero percent chance I would ever use it.

which is what I wrote above.

I don't have a problem with the data model. I have a problem with the natural join stuff which joins two tables together based on column naming conventions - and not foreign keys.

Now, if natural join read the dictionary and looked for the primary key/foreign key relationships (an unambiguous one) and did the join based on that - then I'd be impressed.

Using the same column names? beyond naive. beyond not smart. I don't know why the sql committee did that. It just isn't smart.


Finally, consider that many aspects of SQL are non-intuitive,

please replace SQL with <any language goes here>.


I'm sure you wouldn't claim this was a bug, rather attribute the mistake
to the user's unfamiliarity with the language or the data model.


well, umm, YES - why wouldn't I? Do you learn a language before you use it? What does?????

Here, take this code:


#include "stdio.h"

int main()
{
   int x = 42;
   int y =  0;

   x = x + (--y + y--);

   printf( "x = %d, y = %d\n", x, y );
   return 0;
}




what is the answer? Unless you know C, unless you really know C, you won't be able to answer that short of guessing and you won't be able to write safe C code (well, I guess you could write the most pedantic C code ever using 1% of the language.... hmmm, that sounds like a lot of SQL coders I know...)


The answer by the way is 40, because Y is decremented once with the --y, resulting in -1 being added to -1 and then y is decremented again resulting in -2. The answer is:

[tkyte@localhost ~]$ make test
cc     test.c   -o test
[tkyte@localhost ~]$ ./test
x = 40, y = -2
[tkyte@localhost ~]$ 



Not intuitive, unless of course you knew C.