## Question and Answer

## You Asked

Team,

We have a table in an application that has longitude and lattitue as NUMBER datatype.

also composite index exists on the columns (LATITUDE, LONGITUDE) in this order.

the sql from the application goes like this.

select *

from the_table

where Latitude BETWEEN 39.188920190130624 AND 42.091677809869374

AND longitude BETWEEN -75.78729387674332 AND -72.017976123256688 ;

this index doesn't help us here, and it goes for Full table scan. so planning to have a spatial index like this.

alter table the_table add x sdo_geometry;

update the_table

set x = sdo_geometry( 2001,null,sdo_point_type( LONGITUDE ,LATITUDE,null),null,null )

where latitude is not null;

commit;

insert into user_sdo_geom_metadata( table_name,column_name,diminfo,srid)

values( 'THE_TABLE' , 'X', mdsys.sdo_dim_array( mdsys.sdo_dim_element('X',-180,180,0.00005),

mdsys.sdo_dim_element('Y',-90,90,0.00005)),null);

commit;

create index test_idx

on the_table( x )

indextype is mdsys.spatial_index;

exec dbms_stats.gather_table_stats( user,'the_table');

so, could you help us how to rewrite the above sql statement where clause? do i need to go with LOCATOR_WITHIN_DISTANCE or SDO_COVERS operator?

We have a table in an application that has longitude and lattitue as NUMBER datatype.

also composite index exists on the columns (LATITUDE, LONGITUDE) in this order.

the sql from the application goes like this.

select *

from the_table

where Latitude BETWEEN 39.188920190130624 AND 42.091677809869374

AND longitude BETWEEN -75.78729387674332 AND -72.017976123256688 ;

this index doesn't help us here, and it goes for Full table scan. so planning to have a spatial index like this.

alter table the_table add x sdo_geometry;

update the_table

set x = sdo_geometry( 2001,null,sdo_point_type( LONGITUDE ,LATITUDE,null),null,null )

where latitude is not null;

commit;

insert into user_sdo_geom_metadata( table_name,column_name,diminfo,srid)

values( 'THE_TABLE' , 'X', mdsys.sdo_dim_array( mdsys.sdo_dim_element('X',-180,180,0.00005),

mdsys.sdo_dim_element('Y',-90,90,0.00005)),null);

commit;

create index test_idx

on the_table( x )

indextype is mdsys.spatial_index;

exec dbms_stats.gather_table_stats( user,'the_table');

so, could you help us how to rewrite the above sql statement where clause? do i need to go with LOCATOR_WITHIN_DISTANCE or SDO_COVERS operator?

with LiveSQL Test Case:

## and we said...

I spoke to one of the Spatial Product Managers to get this response:

Using spatial queries is definitely the better option, especially for long/lat coordinates, and is the only way if you want to search for more than points in a rectangle. For that however, we need actual geometry objects (in SDO_GEOMETRY). I am starting with this table:

Then I expand that table with a geometry column:

And I populate the geometries:

Notice the number 4326. This number is a spatial reference system ID (or SRID). It says that the coordinates are not just any random numbers, but that they are longitude and latitude in GPS coordinates (also known as WGS84 coordinates). This is extremely important. If we do not tell the database what coordinate system we are using, then we will not be able to to meaningful comparisons or measurements. This is especially true for spherical coordinates (long/lat). If we do not say that they are spherical, then any measurement (area, length, distances, ...) will be totally wrong. Also we will not be able to correlate them with other geographic shapes, in other coordinate systems.

Coordinate systems for spatial data are like time zones for time. If I call for a tele conference at 2PM and send that to my people in Moscow, Berlin, Tokyo and New York, all of them will dial-in at 2PM - and find no-one else: they dialed at 2PM in their time zone. I need to specify 2PM CET for example. Same for spatial coordinates: if I do not specify in which coordinate system they are expressed, I will be unable to compare them.

Coordinate systems come in two flavors. Those that are represented on the spherical representation of the globe (called geodetic systems). And where you may think there is only one universal long/lat system, there are actually hundreds of different long/lats, each one using a slightly different representation of the earth, adapted to local conditions. But when nothing is specified, then in 99.9% of the case, it means the long/lat coordinates are in the WGS84 system, the one used by the GPS (and the other global positioning system). Incidentally: the earth is NOT a sphere. It is really slightly flattened at the poles (due to its rotation) … Also there used to be some systems using a different meridian for the origin (other than Greenwich): some old french systems used the Paris meridian (obviously).

The other kind of coordinate systems are projections: the coordinates are meters from some origin. There are literally thousands of those (the oracle database has some 5000 of them). Each country has several because projections lose accuracy on large areas. Each US state has several projections. There are also old and new projections - Oracle provides functions to convert from one coordinate system to the other.

To use the spatial types, you need to define metadata:

Where:

- T1 is the table name and LOCATION the name of the SDO_GEOMETRY column.

- The DIMINFO structure describes the bounds of each dimension. For long/lat this is always -180 to 180 and -90 to 90.

- The value 0.05 in this structure (the same for each dimension) represents the tolerance of the spatial data in that table. It is expressed in meters and represents the minimum distance between two points for being considered distinct. It is not very important when dealing with customer locations (since those are very approximate), but it is important when it comes to determine if two adjacent land parcels are really adjacent. Here 0.05 means 5cm. It also means that if two parcels are separated

That is actually a resource-consuming operation, quite CPU-intensive, more so for long/lat coordinates.

Now we can start doing some queries. To do some of them, we use a table that contains the shape of all countries worldwide. You can find it along with other tables via SPATIAL_COUNTRIES and importing via traditional import. The imported tables are:

And then we can do some sample queries:

1) The original query:

2) Find all points within 20km from a given point:

3) How far are they ?

Note that point 42 is the closest - at a distance of 0 meters.

4) How many points are in Germany ?

5) How many points in each country ?

6) Countries with the largest number of points

7) Points not in any country

1 row selected.

8) Points within a radius of 100km around Denver

Using spatial queries is definitely the better option, especially for long/lat coordinates, and is the only way if you want to search for more than points in a rectangle. For that however, we need actual geometry objects (in SDO_GEOMETRY). I am starting with this table:

create table t1 as select rownum id, rpad(rownum,10) data, dbms_random.value(-90,90) lat, dbms_random.value(-180,180) lng from (select 1 from dual connect by level <= 1000), (select 1 from dual connect by level <= 1000);

Then I expand that table with a geometry column:

alter table t1 add location sdo_geometry;

And I populate the geometries:

update t1 set location = sdo_geometry(2001,4326,sdo_point_type(lng,lat,null),null,null); commit;

Notice the number 4326. This number is a spatial reference system ID (or SRID). It says that the coordinates are not just any random numbers, but that they are longitude and latitude in GPS coordinates (also known as WGS84 coordinates). This is extremely important. If we do not tell the database what coordinate system we are using, then we will not be able to to meaningful comparisons or measurements. This is especially true for spherical coordinates (long/lat). If we do not say that they are spherical, then any measurement (area, length, distances, ...) will be totally wrong. Also we will not be able to correlate them with other geographic shapes, in other coordinate systems.

Coordinate systems for spatial data are like time zones for time. If I call for a tele conference at 2PM and send that to my people in Moscow, Berlin, Tokyo and New York, all of them will dial-in at 2PM - and find no-one else: they dialed at 2PM in their time zone. I need to specify 2PM CET for example. Same for spatial coordinates: if I do not specify in which coordinate system they are expressed, I will be unable to compare them.

Coordinate systems come in two flavors. Those that are represented on the spherical representation of the globe (called geodetic systems). And where you may think there is only one universal long/lat system, there are actually hundreds of different long/lats, each one using a slightly different representation of the earth, adapted to local conditions. But when nothing is specified, then in 99.9% of the case, it means the long/lat coordinates are in the WGS84 system, the one used by the GPS (and the other global positioning system). Incidentally: the earth is NOT a sphere. It is really slightly flattened at the poles (due to its rotation) … Also there used to be some systems using a different meridian for the origin (other than Greenwich): some old french systems used the Paris meridian (obviously).

The other kind of coordinate systems are projections: the coordinates are meters from some origin. There are literally thousands of those (the oracle database has some 5000 of them). Each country has several because projections lose accuracy on large areas. Each US state has several projections. There are also old and new projections - Oracle provides functions to convert from one coordinate system to the other.

To use the spatial types, you need to define metadata:

insert into user_sdo_geom_metadata (table_name,column_name,diminfo,srid) values ( 'T1' , 'LOCATION', sdo_dim_array( sdo_dim_element('Longitude',-180,180,0.05), sdo_dim_element('Latgitude',-90,90,0.05) ), 4326 ); commit;

Where:

- T1 is the table name and LOCATION the name of the SDO_GEOMETRY column.

- The DIMINFO structure describes the bounds of each dimension. For long/lat this is always -180 to 180 and -90 to 90.

- The value 0.05 in this structure (the same for each dimension) represents the tolerance of the spatial data in that table. It is expressed in meters and represents the minimum distance between two points for being considered distinct. It is not very important when dealing with customer locations (since those are very approximate), but it is important when it comes to determine if two adjacent land parcels are really adjacent. Here 0.05 means 5cm. It also means that if two parcels are separated

create index t1_sx on t1 (location) indextype is mdsys.spatial_index;

That is actually a resource-consuming operation, quite CPU-intensive, more so for long/lat coordinates.

Now we can start doing some queries. To do some of them, we use a table that contains the shape of all countries worldwide. You can find it along with other tables via SPATIAL_COUNTRIES and importing via traditional import. The imported tables are:

TABLE_NAME NUM_ROWS ------------------------------ ---------- US_CITIES 195 US_COUNTIES 3230 US_INTERSTATES 239 US_PARKS 6371 US_RASTERS 1 US_RIVERS 56 US_STATES 56 WORLD_CONTINENTS 111 WORLD_COUNTRIES 251 31 rows selected.

And then we can do some sample queries:

1) The original query:

select t1.id from t1 where sdo_inside ( location, sdo_geometry( 2003,4326,null, sdo_elem_info_array(1,1003,3), sdo_ordinate_array ( -75.78729387674332, 39.188920190130624, -72.017976123256688, 42.091677809869374 ) ) ) = 'TRUE';

2) Find all points within 20km from a given point:

select b.id from t1 a, t1 b where a.id = 42 and sdo_within_distance (b.location, a.location, 'distance=20 unit=km') = 'TRUE'; ID ---------- 29373 42 954025 922187 4 rows selected.

3) How far are they ?

select b.id, sdo_geom.sdo_distance(a.location, b.location, 0.05, 'unit=m') distance from t1 a, t1 b where a.id = 42 and sdo_within_distance (b.location, a.location, 'distance=20 unit=km') = 'TRUE' order by distance; ID DISTANCE ---------- ---------- 42 0 29373 10881.377 954025 12654.2977 922187 17152.4322 4 rows selected.

Note that point 42 is the closest - at a distance of 0 meters.

4) How many points are in Germany ?

select count(*) from t1 p, world_countries w where sdo_inside (p.location, w.geometry) = 'TRUE' and w.cntry_name = 'Germany'; COUNT(*) ---------- 716 1 row selected.

5) How many points in each country ?

select w.cntry_name, count(*) from t1 p, world_countries w where sdo_inside (p.location, w.geometry) = 'TRUE' group by w.cntry_name order by w.cntry_name; CNTRY_NAME COUNT(*) ---------------------------------------- ---------- Afghanistan 970 Albania 51 Algeria 3315 Andorra 1 Angola 1572 Antarctica 81817 Antigua and Barbuda 1 ... Western Samoa 1 Yemen 533 Zaire 2901 Zambia 1023 Zimbabwe 499 203 rows selected.

6) Countries with the largest number of points

select w.cntry_name, count(*) npoints from t1 p, world_countries w where sdo_inside (p.location, w.geometry) = 'TRUE' group by w.cntry_name order by npoints desc fetch first 5 rows only; CNTRY_NAME NPOINTS ---------------------------------------- ---------- Antarctica 81817 Russia 45255 Canada 26235 United States 17071 China 14642 5 rows selected.

7) Points not in any country

select count(*) from t1 where id not in ( select p.id from t1 p, world_countries w where sdo_inside (p.location, w.geometry) = 'TRUE' ); COUNT(*) ---------- 680527

1 row selected.

8) Points within a radius of 100km around Denver

select count(*) from t1, us_cities c where c.city = 'Denver' and sdo_within_distance (t1.location, c.location, 'distance=100 unit=km') = 'TRUE'; COUNT(*) ---------- 51 1 row selected.

## Rating

Is this answer out of date? If it is, please let us know via a Comment

# Comments

Connor, could we adapt your Spatial technique to a BETWEEN query which involves 2 columns and just simple numbers ?

e.g.

I've followed a technique in this link;

https://www.orafaq.com/node/1798

...which works well for my SINGLE ROW query and indexed on (end_key, start_key).

However, performance ain't so great when there's is no rows found for the BIND value.

Looking at your Spatial example, something like this perhaps;

I just wasn't sure how I can get Spatial (the SDO_GEOMETRY column and it's index) to work for "normal" numbers that aren't latitude/longitude.

I would like to try Spatial out to see if it can handle BETWEEN queries better than the method I'm currently using which works well ONLY when there is data found.

e.g.

select * from some_table where start_key <= :1 and end_key >= :1 and rownum < 2

I've followed a technique in this link;

https://www.orafaq.com/node/1798

...which works well for my SINGLE ROW query and indexed on (end_key, start_key).

However, performance ain't so great when there's is no rows found for the BIND value.

Looking at your Spatial example, something like this perhaps;

select * from some_table st where sdo_within_distance (st.start_key, st.end_key, '??? not sure what goes here ???') = 'TRUE';

I just wasn't sure how I can get Spatial (the SDO_GEOMETRY column and it's index) to work for "normal" numbers that aren't latitude/longitude.

I would like to try Spatial out to see if it can handle BETWEEN queries better than the method I'm currently using which works well ONLY when there is data found.

Yeah, I'm not sure that is going to fly.

You would need to map max/min keys to the allowable lat/long ranges...and explaining *that* the next person that needs to maintain your code is going to be ..... fun :-)

I'd perhaps look at some granular 2-level partitioning on start and end keys. The indexing then gives you fast single row lookups, and the partitioning give you fast range lookups.

You would need to map max/min keys to the allowable lat/long ranges...and explaining *that* the next person that needs to maintain your code is going to be ..... fun :-)

I'd perhaps look at some granular 2-level partitioning on start and end keys. The indexing then gives you fast single row lookups, and the partitioning give you fast range lookups.