I received a request to build a table that looks something like:
create table META_TABLE (
TABLE_ITEM int constraint META_TABLE_PK primary key not null,
ITEM_TYPE char(30) not null, -- service, retail, maintenance
CREATE_DATE date not null,
PARTY1_ID int not null,
PARTY2_ID int not null,
DESCRIPTION char(200) not null,
STATUS char(30) not null,
SVC_TYPE char(30) null,
SVC_HOURS number null,
MILEAGE number null,
PRODUCT_ID int null,
QUANTITY int null,
TOTAL_SALE number(10,4) null,
EQUIP_ID int null,
MAINT_TYPE char(30) null
);
Note the column lists are for illustration; we're in healthcare and I didn't want to use esoteric terms like ICD10, SNOMED, HCPCS, etc. The application if primarily OLTP, not OLAP/BI. Several columns are used only for specific values of "ITEM_TYPE." A set of inserts would look like:
insert into META_TABLE
select 1, 'SERVICE', sysdate, 11, 6001, 'Service example 1', 'SCHEDULED', 'HVAC', 3.5, 16, null, null, null, null, null from dual
union
select 2, 'SERVICE', sysdate, 37, 1203, 'Service example 2', 'COMPLETE', 'Pool', 2, 9, null, null, null, null, null from dual
union
select 3, 'RETAIL', sysdate, 16, 9600, 'Retail example 1', 'COMPLETE', null, null, null, 654321, 3, 87.52, null, null from dual
union
select 4, 'RETAIL', sysdate, 203, 1107, 'Retail example 2', 'COMPLETE', null, null, null, 2468, 11, 23.45, null, null from dual
union
select 5, 'MAINTENANCE', sysdate, 102, 0, 'Maintenance example 1', 'SCHEDULED', null, null, null, null, null, null, 16, 'PM' from dual
union
select 6, 'MAINTENANCE', sysdate, 102, 0, 'Maintenance2', 'IN PROGESS', null, null, null, null, null, null, 17, 'CLEANING' from dual ;
I would prefer to build four tables that are like:
create table ITEM_TABLE (
TABLE_ITEM int constraint ITEM_TABLE_PK primary key not null,
ITEM_TYPE char(30) not null,
CREATE_DATE date not null,
PARTY1_ID int not null,
PARTY2_ID int not null,
DESCRIPTION char(200) not null,
STATUS char(30) not null
);
create table ITEM_SERVICE_TABLE (
SERVICE_TABLE_ITEM int constraint ITEM_SERVICE_TABLE_PK primary key not null,
TABLE_ITEM int not null,
SVC_TYPE char(30) not null,
SVC_HOURS number not null,
MILEAGE number not null,
constraint TABLE_ITEM_FK1 foreign key (TABLE_ITEM) references ITEM_TABLE (TABLE_ITEM)
);
create table ITEM_RETAIL_TABLE (
RETAIL_TABLE_ITEM int constraint ITEM_RETAIL_TABLE_PK primary key not null,
TABLE_ITEM int not null,
PRODUCT_ID int not null,
QUANTITY int not null,
TOTAL_SALE number(10,4) not null,
constraint TABLE_ITEM_FK2 foreign key (TABLE_ITEM) references ITEM_TABLE (TABLE_ITEM)
);
create table ITEM_MAINTENANCE_TABLE (
MAINTENANCE_TABLE_ITEM int constraint ITEM_MAINTENANCE_TABLE_PK primary key not null,
TABLE_ITEM int not null,
EQUIP_ID int not null,
MAINT_TYPE char(30) not null,
constraint TABLE_ITEM_FK3 foreign key (TABLE_ITEM) references ITEM_TABLE (TABLE_ITEM)
);
I know the proposed model is not in 2NF (and therefore not in 3NF), and I can explain this in abstract/academic terms, but I'm not sure how to articulate in business language why this is sub-optimal. Can you give me a some ammunition, or do you know a good resource?
Thanx,
Don
Well, as meta_table allows nulls, arguably it isn't even in 1NF!
I'm not sure you do have a 2NF violation though. How do you know this table isn't in 2NF? What are the candidate keys? And which non-key columns depend on a subset of the columns of a candidate key?
In any case, the single table still has several drawbacks.
Ensuring the correct columns are setSo I'm guessing that ITEM_TYPE determines which other columns are not null. Which means you need complex check constraints validating the correct ones are set along the lines of:
case when item_type = 'SERVICE' then svc_type end is not null and
case when item_type = 'SERVICE' then svc_hours end is not null and
case when item_type = 'SERVICE' then coalesce(product_id, quantity, total_sale, etc) end is null
This is easy to get wrong. And a faff to update when you add new items or attributes. And a real headache if any of the attributes are optional for a given type.
Query resultsYou need to ensure that all your queries only select the appropriate type and its associated columns. This is easy to overlook.
With the child tables either you're selecting from just that table or joining back to the parent. In either case, it's less likely you'll accidentally fetch incorrect items.
Referential integrityWill any of the types have child tables that are specific to itself? e.g. will SERVICE items have rows that can only be their children, not RETAILS? If so you need to go to some effort to ensure the FKs prevent this. This involves having a unique constraint including the type and copying that column to the child. With the separate child tables you just make the FK as normal.
Though arguably you should do this on your item* tables to ensure they only store rows for the correct parent.
PerformanceWith one big table queries that do full scans, but only return one item type have more (unnecessary) data to process. Of course this could go the other way too. If you'll regularly fetch rows for more than one item type the overhead of joining could slow you down. Ultimately this is something you need to test in your application.
Also: what's with using char? You almost certainly want to use varchar2, unless you like dealing with blank padded strings:
https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2668391900346844476