Hi Answers Team.
I keep seeing my maintenance / design / development teams abusing dates everywhere. One example is the use of YYYYMM as a way to extract a month from some piece of data what starts out as a date.
1. database design will create a column like
REPORTED_MONTH NUMBER
;
2.
TO_CHAR(reported_date,'YYYYMM')
will be used to fill this column with values.
3. to extract rows for some date range (like a particular quarter) will use code like this:
AND REPORTED_MONTH BETWEEN 200101 AND 200103;
There is so much wrong with this that it makes me cry. But the costs of redesign and refactoring and retesting seems to large that I am looking for a possible alternative to dealing with this issue.
Currently we do things like this rewriting the query predicates, and collecting extended stats as seen here:
1.
and to_date(reported_month,'YYYYMM') between to_date('200101','YYYYMM') and to_date('200103','YYYYMM'); 2. collect extended stats on
to_date(reported_month','YYYYMM')
This at least appears to fix issues like some cardinality estimations in query plans by giving back to the database, the understanding that these values are dates and so DATE MATH applies to them, and give the database basic stats and if needed histograms on the actual expression in order to handle conditions like out-of-bounds and range queries.
But i wonder if it would be worth the time to create a new data type to express the idea of MONTH.
1. What would be required to create a user defined type that allowed for this shortened date info YYYYMM.
2. How would this type enforce basic date concepts like month <=12.
3. What would be needed in the type definition to ensure that cardinality estimations understood DATE MATH was applicable to these items.
4 Or is this just a stupid idea that would only have me expending a long amount of time.
5. You got any advice?
We already do! It's called a year to month interval:
http://docs.oracle.com/cd/E11882_01/server.112/e41084/sql_elements001.htm#SQLRF00206 It's intended to show a duration of time instead of a fixed date. But you could use it for your needs.
It understands date math (though only at the year/month level). And ensures you can't store invalid months:
create table t (
rep_month interval year(4) to month
);
insert into t values (interval '2001-01' year(4) to month);
insert into t values (interval '2001-06' year(4) to month);
insert into t values (interval '2016-01' year(4) to month);
insert into t values (interval '2016-13' year(4) to month);
SQL Error: ORA-01843: not a valid month
select rep_month,
rep_month-lag(rep_month) over (order by rep_month) diff
from t;
REP_MONTH DIFF
+2001-01
+2001-06 +00-05
+2016-01 +14-07
select *
from t
where rep_month between interval '2015-01' year(4) to month
and interval '2016-06' year(4) to month;
REP_MONTH
+2016-01