Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, THRG.

Asked: July 26, 2009 - 2:03 pm UTC

Last updated: July 26, 2009 - 10:18 pm UTC

Version: 10g,11g

Viewed 1000+ times

You Asked

Hi Tom,

I am looking for a efficient way of matching data for following scenario,my data is organized in this manner,(more of key value pair - i know you hate this type of design)

Table x
uid object_id attribute_id value
1 100 ATT1 10
2 100 ATT2 TRUE
3 312 ATT1 5
4 312 ATT2 FALSE


in a scenario one , I need to match this with a following row type request

Request
========
object_id att1 value1 att2 value2 att3 value3 att4 value4 ...
100 ATT1 10 ATT2 6 null null null null ....

so how do i find the best match against the above type of row request using SQL with out going for a procedural approach ?

in my second scenario , if I am to match similar type (pattern of attribute and value pairs) of attributes and to calculate a particular value by summing up , how do I do it in a SQL ?

object id attribute value
100 att1 100
100 att2 TRUE
100 att3 UP
.
.
300 att1 100
300 att2 TRUE
300 att3 UP
.
.

so if the data is like above attributes & values of object id 100 and 300 matches so they can be grouped into a one category. In addition lets say that i would like to some-up att1 value 100 ..so I would get 200.

in certain cases i also have to group by only the attribute but not value as well.

Is there a term that is used for this type of/class of problems ?

(I didn't have access to a oracle db while i wrote this ..so please forgive me if i have any mistakes in my samples bellow)

currently i thought of doing pivot / unpivot operations but the dynamic nature of these classes of attributes makes most part of the query to be dynamic.

Thanks

SQL
====

create table x
(uid number primary key,
 object_id  number,
 attribute_id varchar2(50),
 value varchar2(100)
);

insert into x (uid,object_id,attribute_id,value) values(1,100,'ATT1','10');
insert into x (uid,object_id,attribute_id,value) values(2,100,'ATT2','TRUE');
insert into x (uid,object_id,attribute_id,value) values(3,312,'ATT1','5');
insert into x (uid,object_id,attribute_id,value) values(4,312,'ATT2','FALSE');

insert into x (uid,object_id,attribute_id,value) values(5,300,'ATT1','10');
insert into x (uid,object_id,attribute_id,value) values(6,300,'ATT2','TRUE');

insert into x (uid,object_id,attribute_id,value) values(7,400,'ATT1','10');
insert into x (uid,object_id,attribute_id,value) values(8,400,'ATT2','TRUE');
insert into x (uid,object_id,attribute_id,value) values(9,400,'ATT3','3.4');
insert into x (uid,object_id,attribute_id,value) values(10,400,'ATT4','STND');




and Tom said...

... i know you hate this type of design)


for good reasons (plural)

... o how do i find the best match against the above type of row request using SQL with out going for a procedural approach ...

and there is one of them now!!!!! check it out - you have discovered problem #143212 with this storage 'technique'

it is very very very secure....

you can put data in

but getting it back out is a bear.


Here is a snippet from Effective Oracle by Design. It answers your question but also points out the fact that *this will never perform in real life*

Time and time and time again, I see it done - over and over and over. And now you'll tell me "well, we have this sunk cost, we have to use this model". You are doomed to excruciating response times - you have doomed your end users to endure this. And ultimately, the 'architect' behind this will blame the database for poor performance.

<quote src=Effective Oracle by Design>

Don¿t Use Generic Data Models

Frequently, I see applications built on a generic data model for ¿maximum flexibility¿ or applications built in ways that prohibit performance. Many times, these are one and the same thing! For example, it is well known you can represent any object in a database using just four tables:

Create table objects ( oid int primary key, name varchar2(255) );
Create table attributes
( attrId int primary key, attrName varchar2(255),
datatype varchar2(25) );
Create table object_Attributes
( oid int, attrId int, value varchar2(4000),
primary key(oid,attrId) );
Create table Links ( oid1 int, oid2 int,
primary key (oid1, oid2) );


That¿s it. No more CREATE TABLE for me! I can fill the ATTRIBUTES table with rows like this:

insert into attributes values ( 1, 'DATE_OF_BIRTH', 'DATE' );
insert into attributes values ( 2, 'FIRST_NAME',    'STRING' );
insert into attributes values ( 3, 'LAST_NAME',    'STRING' );
commit;


And now I¿m ready to create a PERSON record:
insert into objects values ( 1, 'PERSON' );
insert into object_Attributes values( 1, 1, '15-mar-1965' );
insert into object_Attributes values( 1, 2, 'Thomas' );
insert into object_Attributes values( 1, 3, 'Kyte' );
commit;
insert into objects values ( 2, 'PERSON' );
insert into object_Attributes values( 2, 1, '21-oct-1968' );
insert into object_Attributes values( 2, 2, 'John' );
insert into object_Attributes values( 2, 3, 'Smith' );
commit;


And since I¿m good at SQL, I can even query this record to get the FIRST_NAME and LAST_NAME of all PERSON records:

ops$tkyte@ORA920> select 
     max( decode(attrName, 'FIRST_NAME', value, null )) first_name,
  2  max( decode( attrName, 'LAST_NAME',  value, null ) ) last_name
  3    from objects, object_attributes, attributes
  4   where attributes.attrName in ( 'FIRST_NAME', 'LAST_NAME' )
  5     and object_attributes.attrId = attributes.attrId
  6     and object_attributes.oid = objects.oid
  7     and objects.name = 'PERSON'
  8   group by objects.oid
  9  /

FIRST_NAME           LAST_NAME
-------------------- --------------------
Thomas               Kyte
John                 Smith


Looks great, right? I don¿t need to create tables anymore, because I can add columns at the drop of a hat (with an insert into the ATTRIBUTES table). The developers can do whatever they want, and the DBA can¿t stop them. This is ultimate flexibility. I¿ve seen people try to build entire systems based on this model.

But how does this model perform? Miserably, terribly, and horribly. A simple select first_name, last_name from person query is transformed into a three-table join with aggregates and all. Furthermore, if the attributes are NULLABLE¿that is, there might not be a row in OBJECT_ATTRIBUTES for some attributes¿you may need to use an outer join instead of just joining, which might remove more optimal query plans from consideration.

Writing queries with this model might look straightforward. For example, if I wanted to get everyone who was born in March or has the last name of Smith, I could simply take the query to get the FIRST_NAME and LAST_NAME of all PERSON records and wrap an inline view around it:

ops$tkyte@ORA920> select *
  2    from (
  3  select
     max(decode(attrName, 'FIRST_NAME', value, null)) first_name,
  4  max(decode(attrName, 'LAST_NAME',  value, null)) last_name,
  5  max(decode(attrName, 'DATE_OF_BIRTH',  value, null))
                                                      date_of_birth
  6    from objects, object_attributes, attributes
  7   where attributes.attrName in ( 'FIRST_NAME',
                                     'LAST_NAME', 'DATE_OF_BIRTH' )
  8     and object_attributes.attrId = attributes.attrId
  9     and object_attributes.oid = objects.oid
 10     and objects.name = 'PERSON'
 11   group by objects.oid
 12         )
 13   where last_name = 'Smith'
 14      or date_of_birth like '%-mar-%'
 15  /

FIRST_NAME           LAST_NAME            DATE_OF_BIRTH
-------------------- -------------------- --------------------
Thomas               Kyte                 15-mar-1965
John                 Smith                21-oct-1968


So, it looks easy to query, but think about the performance! If you had a couple thousand OBJECT records and a couple tens of thousands of OBJECT_ATTRIBUTES, Oracle would need to process the entire inner group by query first and then apply the WHERE clause.

This is not a made-up data model, one that I crafted just to make a point. This is an actual data model that I¿ve seen people try to use. Their goal is ultimate flexibility. They don¿t know what OBJECTS they need, and they don¿t know what ATTRIBUTES they will have. Well, that is what the database was written for in the first place: Oracle implemented this thing called SQL to define OBJECTS and ATTRIBUTES and lets you use SQL to query them. People who use a data model like this are trying to put a generic layer on top of a generic layer, and it fails each and every time, except in the most trivial of applications.

</quote>

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

what if column and column matching

Shantanu, August 20, 2009 - 7:15 am UTC

HiTom!

the solution is as usual excelent.

But like here if the data is feed in columns only.

object_id att1 value1 att2 value2 att3 value3 att4
100 ATT1 10 ATT2 6 null null null null .

I have total profit which is feed in columns for 104 weeks

week1 week2 week3 week4 week5 week6
20 27 30 32 15 12

and in every new week it will again show the data of last 104 weeks only. So the data will be like

week1 week2 week3 week4 week5 week6
37 20 27 30 32 15

if by mistake data shown for current week is not matching with the data shown for the last week.

then How can I match the data for last and current week and find which week data is giving different values.