Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Anil.

Asked: December 18, 2004 - 2:22 pm UTC

Last updated: December 20, 2004 - 9:36 am UTC

Version: 10.1.0.3.0

Viewed 1000+ times

You Asked

Hi Tom,

We have a requirement to comapre city code against locations where the input values is allways a city code but the stored value in the database will be either city,country,region or area. (for example for city NYC rule is applicable if the location column has either NYC or US or AMERICA or AREA1

Now location can have either of these 4 values . Would you please tell me best way to write a query for this

I have tried couple of things. I kept a mapping table for each city

craete table city_map (
CITY VARCHAR2(10) ,
country VARCHAR2(10),
Region VARCHAR2(10) ,
AREA VARCHAR2(10) )

and tried following queries
note emb is used for testing
which has following columns
create table emb (
code varchar2(5),
location varchar2(10), -- either city or country or region or area
level varchar2(10)) ---> Here I keep the level for city =1 country = 2 so on

Data in emb would be

PXX,US,2
PYY,NYC,1
PUU,LON,1,
PUU,UK,1

Value in

select code from emb a,city_map b
where (location=b.city
or location=b.country
)
and city ='NYC'

ODE LOCATION
----- ----------
PXX US
PYY NYC

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=29 Card=22576 Bytes=564400)
1 0 CONCATENATION
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'EMB' (TABLE) (Cost=13 Card=1 Bytes=8)
3 2 NESTED LOOPS (Cost=14 Card=1 Bytes=25)
4 3 TABLE ACCESS (BY INDEX ROWID) OF 'CITY_MAP' (TABLE) (Cost=1 Card=1 Bytes=17)
5 4 INDEX (RANGE SCAN) OF 'CITY_MAP1' (INDEX) (Cost=1 Card=1)
6 3 INDEX (RANGE SCAN) OF 'EMB1' (INDEX) (Cost=0 Card=22575)
7 1 TABLE ACCESS (BY INDEX ROWID) OF 'EMB' (TABLE) (Cost=13 Card=22575 Bytes=180600)
8 7 NESTED LOOPS (Cost=14 Card=22575 Bytes=564375)
9 8 TABLE ACCESS (BY INDEX ROWID) OF 'CITY_MAP' (TABLE) (Cost=1 Card=1 Bytes=17)
10 9 INDEX (RANGE SCAN) OF 'CITY_MAP1' (INDEX) (Cost=1 Card=1)
11 8 INDEX (RANGE SCAN) OF 'EMB1' (INDEX) (Cost=0 Card=22575)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
14 consistent gets
0 physical reads
0 redo size
426 bytes sent via SQL*Net to client
512 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed

Gave me good results with two or clauses but when I added all the 4
i got
select code from emb a,city_map b
where (location=b.city
or location=b.country
or location=b.region
or location=b.area
)
and city ='NYC'

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=36 Card=33863 Bytes=1422246)
1 0 NESTED LOOPS (Cost=36 Card=33863 Bytes=1422246)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'CITY_MAP' (TABLE) (Cost=1 Card=1 Bytes=34)
3 2 INDEX (RANGE SCAN) OF 'CITY_MAP1' (INDEX) (Cost=1 Card=1)
4 1 TABLE ACCESS (FULL) OF 'EMB' (TABLE) (Cost=35 Card=33863 Bytes=270904)




Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
119 consistent gets
0 physical reads
0 redo size
426 bytes sent via SQL*Net to client
512 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed

I also tried following query but could not get any better

select code,location,clevel from emb b
where decode (clevel,'1',(select city from city_map where city='MWG'),
'2', (select country from city_map where city='MWG'),
'3', (select region from city_map where city='MWG'),
'4', (select area from city_map where city='MWG'))=location


Would you please suggest a better way. or is it a better way to approch this scneario.

Thanks and regards
Anil

and Tom said...

this is a strange looking relational data model don't you think?

(and LEVEL is a really bad bad BAD idea for a column name!!! we have a builtin function of the same name!! you need to change that). also, why is level a varchar2 if you have numbers only it in?

but isn't it just:

select *
from emb
where (location,lvl) in ( select decode(x.lvl,
'1',y.city,
'2',y.country,
'3',y.region,
'4',y.area),
x.lvl
from city_map y,
(select '1' lvl from dual union all
select '2' lvl from dual union all
select '3' lvl from dual union all
select '4' lvl from dual) x
where y.city = :BV )


At the very least, city_map should be:

city, lvl, location, primary key(city,lvl)


with 4 rows per city -- so you can just:

where in ( select lvl, location from city_map where city = :bv )


that is what I'm doing with the cartesian product above.


Consider:



ops$tkyte@ORA9IR2> create table city_map (
2 CITY VARCHAR2(10) primary key,
3 country VARCHAR2(10),
4 Region VARCHAR2(10) ,
5 AREA VARCHAR2(10) ) ;
Table created.

ops$tkyte@ORA9IR2> insert into city_map values ( 'NYC', 'US', 'AMERICA', 'AREA1' );
1 row created.

ops$tkyte@ORA9IR2> exec gen_data('city_map',10000);
PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> create table emb (
2 code varchar2(5),
3 location varchar2(10),
4 lvl varchar2(10))
5 /

Table created.

ops$tkyte@ORA9IR2> create index emb_idx on emb(location,lvl);
Index created.

ops$tkyte@ORA9IR2> insert into emb values ( 'PXX', 'US', 2 );
1 row created.

ops$tkyte@ORA9IR2> insert into emb values ( 'PYY', 'NYC', 1 );
1 row created.

ops$tkyte@ORA9IR2> insert into emb values ( 'PUU', 'LON', 1 );
1 row created.

ops$tkyte@ORA9IR2> insert into emb values ( 'PUU', 'UK', 1 );
1 row created.

ops$tkyte@ORA9IR2> exec gen_data( 'emb', 50000 );
PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'CITY_MAP', cascade=>true );
PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'EMB', cascade=>true );
PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> variable bv varchar2(20)
ops$tkyte@ORA9IR2> exec :bv := 'NYC';
PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> select *
2 from emb
3 where (location,lvl) in ( select decode(x.lvl,
4 '1',y.city,
5 '2',y.country,
6 '3',y.region,
7 '4',y.area),
8 x.lvl
9 from city_map y,
10 (select '1' lvl from dual union all
11 select '2' lvl from dual union all
12 select '3' lvl from dual union all
13 select '4' lvl from dual) x
14 where y.city = :BV )
15 /

CODE LOCATION LVL
----- ---------- ----------
PYY NYC 1
PXX US 2

ops$tkyte@ORA9IR2> set autotrace on
ops$tkyte@ORA9IR2> select *
2 from emb
3 where (location,lvl) in ( select decode(x.lvl,
4 '1',y.city,
5 '2',y.country,
6 '3',y.region,
7 '4',y.area),
8 x.lvl
9 from city_map y,
10 (select '1' lvl from dual union all
11 select '2' lvl from dual union all
12 select '3' lvl from dual union all
13 select '4' lvl from dual) x
14 where y.city = :BV )
15 /

CODE LOCATION LVL
----- ---------- ----------
PYY NYC 1
PXX US 2


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=219 Card=1 Bytes=37)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'EMB' (Cost=2 Card=1 Bytes=27)
2 1 NESTED LOOPS (Cost=219 Card=1 Bytes=37)
3 2 VIEW OF 'VW_NSO_1' (Cost=217 Card=1 Bytes=10)
4 3 SORT (UNIQUE) (Cost=217 Card=1 Bytes=46)
5 4 NESTED LOOPS (Cost=70 Card=32672 Bytes=1502912)
6 5 TABLE ACCESS (BY INDEX ROWID) OF 'CITY_MAP' (Cost=2 Card=1 Bytes=43)
7 6 INDEX (UNIQUE SCAN) OF 'SYS_C004447' (UNIQUE) (Cost=1 Card=10001)
8 5 VIEW (Cost=68 Card=32672 Bytes=98016)
9 8 UNION-ALL
10 9 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
11 9 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
12 9 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
13 9 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
14 2 INDEX (RANGE SCAN) OF 'EMB_IDX' (NON-UNIQUE) (Cost=1 Card=1)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
25 consistent gets
0 physical reads
0 redo size
530 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
2 rows processed

that is not horrible -- 25 lios for 2 rows, but it isn't "good", we can do better

ops$tkyte@ORA9IR2> set autotrace off

ops$tkyte@ORA9IR2> create table city_map_better
2 ( city, lvl, location, primary key(city,lvl) )
3 organization index

4 as
5 ( select y.city,
6 x.lvl,
7 decode(x.lvl,
8 '1',y.city,
9 '2',y.country,
10 '3',y.region,
11 '4',y.area)
12 from city_map y,
13 (select '1' lvl from dual union all
14 select '2' lvl from dual union all
15 select '3' lvl from dual union all
16 select '4' lvl from dual) x
17 )
18 /
Table created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'CITY_MAP_BETTER', cascade=>true );
PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> select *
2 from emb
3 where (location,lvl) in ( select location,lvl
4 from city_map_better
5 where city = :bv )
6 /

CODE LOCATION LVL
----- ---------- ----------
PYY NYC 1
PXX US 2

ops$tkyte@ORA9IR2> set autotrace on
ops$tkyte@ORA9IR2> select *
2 from emb
3 where (location,lvl) in ( select location,lvl
4 from city_map_better
5 where city = :bv )
6 /

CODE LOCATION LVL
----- ---------- ----------
PYY NYC 1
PXX US 2


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=17 Card=1 Bytes=50)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'EMB' (Cost=2 Card=1 Bytes=27)
2 1 NESTED LOOPS (Cost=17 Card=1 Bytes=50)
3 2 SORT (UNIQUE)
4 3 INDEX (RANGE SCAN) OF 'SYS_IOT_TOP_35286' (UNIQUE) (Cost=2 Card=4 Bytes=92)
5 2 INDEX (RANGE SCAN) OF 'EMB_IDX' (NON-UNIQUE) (Cost=1 Card=1)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
11 consistent gets
0 physical reads
0 redo size
530 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
2 rows processed

ops$tkyte@ORA9IR2> set autotrace off

Rating

  (3 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Very useful

Anil Shafeeque, December 19, 2004 - 12:08 am UTC

Thanks Tom for the Quick reply, It looks excellnet. I would like to add someting. I remember you saying test,test,test. so Inspite of sending you a question I continue trying different models. 

First of all I want to say that my field name is clevel, In it was typing mistake. With the same model I could write another query which is as follows

ANIL@NGDEV1-SQL> select code ,location from emb a
where location in (
select decode( r, 1, city, 2, country, 3, region,4,area)
    from city_map, (select rownum r from p1 where rownum <= 3 )
   where city= 'MWG')
  2    3    4    5    6
ANIL@NGDEV1-SQL> /

CODE  LOCATION
----- ----------
PXX   IN
PYY   MWG


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=19 Card=22575 Bytes=338625)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMB' (TABLE) (Cost=14 Card=22575 Bytes=180600)
   2    1     NESTED LOOPS (Cost=19 Card=22575 Bytes=338625)
   3    2       VIEW OF 'VW_NSO_1' (VIEW) (Cost=5 Card=1 Bytes=7)
   4    3         SORT (UNIQUE) (Cost=5 Card=1 Bytes=47)
   5    4           MERGE JOIN (CARTESIAN) (Cost=4 Card=3 Bytes=141)
   6    5             TABLE ACCESS (BY INDEX ROWID) OF 'CITY_MAP' (TABLE) (Cost=1 Card=1 Bytes=34)
   7    6               INDEX (RANGE SCAN) OF 'CITY_MAP1' (INDEX) (Cost=1 Card=1)
   8    5             BUFFER (SORT) (Cost=3 Card=3 Bytes=39)
   9    8               VIEW (Cost=3 Card=3 Bytes=39)
  10    9                 COUNT (STOPKEY)
  11   10                   TABLE ACCESS (FULL) OF 'P1' (TABLE) (Cost=3 Card=3)
  12    2       INDEX (RANGE SCAN) OF 'EMB1' (INDEX) (Cost=2 Card=22575)




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         14  consistent gets
          0  physical reads
          0  redo size
        491  bytes sent via SQL*Net to client
        512  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          2  rows processed


where p1 is a simple one column table with 3 rows.

In this case I do not have to worry about the level, level is irelavant here. I am sure that value of city,country,region,area are unique to each other. I hope I explains.

My LIO here is 14 here. How do you rate this query. 

After reciving your reply I tried with IOT. I have changed the city_map to city_map_better


create table city_map_better
    ( city varchar2(10), location varchar2(10), primary key(city,location) )
    organization index

Loaded 50000 records

tried the following query
Here LIO is reduced to 9 . please comment
ANIL@NGDEV1-SQL> select code ,location from emb a
where location in (select location from city_map_better
where city='MWG')
  2    3    4
ANIL@NGDEV1-SQL> /

CODE  LOCATION
----- ----------
PXX   IN
PYY   MWG


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=15 Card=44707 Bytes=983554)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMB' (TABLE) (Cost=13 Card=22575 Bytes=180600)
   2    1     NESTED LOOPS (Cost=15 Card=44707 Bytes=983554)
   3    2       SORT (UNIQUE) (Cost=2 Card=2 Bytes=28)
   4    3         INDEX (RANGE SCAN) OF 'SYS_IOT_TOP_57557' (INDEX (UNIQUE)) (Cost=2 Card=2 Bytes=28)
   5    2       INDEX (RANGE SCAN) OF 'EMB1' (INDEX) (Cost=0 Card=22575)




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          9  consistent gets
          0  physical reads
          0  redo size
        491  bytes sent via SQL*Net to client
        512  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          2  rows processed



Thank you very much for you valuable advice. Very much appreciated.

Rgds
Anil 
 

Tom Kyte
December 19, 2004 - 10:41 am UTC

I believe "clevel" to be relevant -- your flat model that you have to pivot is (to me) non-intuitive.

read the above two queries of yours and ask yourself "which will make total sense to a human being 5 years from now"

LIO rules?

Gabe, December 19, 2004 - 10:46 am UTC

<Anil quote>I am sure that value of city, country, region, area are unique to each other.</Anil quote>

In that case consider implementing city_map as an IOT with pk=(city,country,region,area). Well, since this pk would likely get migrated, you may want a surrogate for the pk and uk on the composite.

<Tom quote>
At the very least, city_map should be: city, lvl, location, primary key(city,lvl) … with 4 rows per city</Tom quote>

This doesn’t look very relational either. You did say <quote>at the very least</quote> … it is true … but just so Anil has an idea of what is missing:
1. there should be a check constraint on LVL .. should be 1,2,3 or 4
2. LOCATION is still not constrained in any way
3. London exists both in UK and Canada … LOCATION should be in the PK as well.

Arguably a relational model for this should be:
1. Area table with AREA_NAME constrained to be unique
2. Region table with AREA_NAME, REGION_NAME constrained to be unique (AREA_NAME being a FK migrated from Area table)
3. Country table with AREA_NAME, REGION_NAME, COUNTRY_NAME constrained to be unique (AREA_NAME, REGION_NAME being a FK migrated from Region table)
4. City table with AREA_NAME, REGION_NAME, COUNTRY_NAME, CITY_NAME constrained to be unique (AREA_NAME, REGION_NAME, COUNTRY_NAME being a FK migrated from Country table)
Note: at the very least the City table should have a surrogate pk in order to avoid migrating names when implementing fks … personally I would give surrogate pks to all of them.
5. EMB table defined after one finds out what exactly is it supposed to implement … personally I don’t see why one has to contrive a Parent table definition when there is no clue as to what Child is supposed to be … that is, why does EMB has to have LOCATION and LVL anyway?

Modeling strictly by LIO is a _methodology_ I won’t follow.


Tom Kyte
December 19, 2004 - 11:43 am UTC

I'm ignoring his names of city, country, region, area -- in the context he is using them, they don't have that "meaning"

they are in fact locations... multiple location names for the same city.


In looking at the information supplied, the model I would use is above. I would not have the four columns when I most often needed them as four ROWS.



true ...

Gabe, December 20, 2004 - 9:36 am UTC

I re-read the original question and I do see your point … although the quote from Anil which I listed sounds tantalizingly close to a hierarchy.

<quote>they are in fact locations... multiple location names for the same city.</quote>
This would imply we can tell the _cities_ apart (hence there would be a City table … pk=code) and Location (which you had as “city_map_better”) would be a dependent of City … the level 2 label Location for ‘NYC’, which is let us say
‘US’, would be a separate row from the level 2 label Location for ‘LA’ (Los Angeles), which could be ‘US’, ‘USA’, ‘U.S.’ or anything else indeed … the
model won’t imply any sort of relationship between them except for being level 2 locations. This sounds OK … maybe I was looking (and making assumptions) at a larger picture than it is seemed to have been required.

I’ll agree with your last phrase there … emphasis on <quote>In looking at the information supplied</quote>. Since this looks now like mostly static information, one could provide a mv for the pivot and hence both views (as 4
rows and as 4 columns) could be made available.


More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library