One of our Oracle data sources has hundreds of tables in which all numeric columns are defined using the NUMBER data type without precision and scale. In fact a column may store pure integer values or decimal values - there is no way to tell by looking at the data type alone. This is a big problem because when I try to load any of this data into Big Data tools (sqoop, hive, spark etc.), all of them treat these columns as strings, and that is a super painful issue to deal with. I think it is an issue with all JDBC/Java based tools.
My question: is it possible to somehow detect the actual precision and scale of the values stored in a NUMBER column? My hope was that Oracle keeps this somewhere in its metadata tables, but I do not see it anywhere. My last resort is to do a random sampling of the data in each table and store the conversion schema on the side, but I really do hope there is a better way.
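For reference, the data dictionary does not record this when precision and scale are omitted - a quick check against ALL_TAB_COLUMNS (the owner and table name below are just placeholders for one of ours) simply comes back NULL for both:

select column_name, data_type, data_precision, data_scale
from   all_tab_columns
where  owner = 'MYSCHEMA'        -- placeholder
and    table_name = 'TABLEA'     -- placeholder
and    data_type = 'NUMBER';

-- DATA_PRECISION and DATA_SCALE are NULL for columns declared as plain NUMBER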
Example: in tableA, column ID has type NUMBER, column AMOUNT has type NUMBER and column QUANTITY has type NUMBER - no precision or scale specified.
But in fact ID should be bigint, AMOUNT should be decimal(18,6) and QUANTITY should be int.
I cannot possibly do the mapping manually because I have 600 tables and each table has 50-300 columns. Data sampling is my last resort.
Thanks!
I'm lost. Without data sampling, how do you expect to know anything about a column defined as NUMBER other than that it could contain *any* kind of NUMBER? Other than the limits of the NUMBER datatype itself (https://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#CHDHDHGB), there is nothing we can see without looking at the data. You could do something like this:
SQL> create table t ( x number );
Table created.
SQL>
SQL> insert into t values (123);
1 row created.
SQL> insert into t values (123123);
1 row created.
SQL> insert into t values (123.123);
1 row created.
SQL> insert into t values (123.123123123);
1 row created.
SQL> insert into t values (123.123123123123123);
1 row created.
SQL> insert into t values (-12.12312312);
1 row created.
SQL> insert into t values (12.123123123456345643745867);
1 row created.
SQL> insert into t values (12312312313123);
1 row created.
SQL> insert into t values (0.678);
1 row created.
SQL> insert into t values (0.678765675678);
1 row created.
SQL> insert into t values (0.1);
1 row created.
SQL>
SQL> select
2 length(to_char(trunc(max(abs(x))))) digits,
3 max(length(to_char(abs(x-trunc(x)))))-1 decimals
4 from t;
DIGITS DECIMALS
---------- ----------
14 24
1 row selected.
to get a reasonable approximation, and that is easy enough to craft into something dynamic, e.g.
SQL> set serverout on
SQL> declare
2 l_template varchar2(1000) := 'select length(to_char(trunc(max(abs(@@COL))))) digits, max(length(to_char(abs(@@COL-trunc(@@COL)))))-1 decimals from @@TAB';
3 l_sql varchar2(1000);
4 l_dig int;
5 l_dec int;
6 begin
7 for i in ( select owner, table_name, column_name
8 from dba_tab_cols
9 where data_type in ('NUMBER','FLOAT')
10 and owner = 'SCOTT'
11 )
12 loop
13 l_sql := replace(replace(l_template,'@@COL',i.column_name),'@@TAB',i.owner||'.'||i.table_name);
14 execute immediate l_sql into l_dig, l_dec;
15 dbms_output.put_line(i.owner||'.'||i.table_name||'.'||i.column_name||'='||l_dig||','||l_dec);
16 end loop;
17 end;
18 /
SCOTT.EMP.EMPNO=4,0
SCOTT.EMP.MGR=4,0
SCOTT.EMP.SAL=4,0
SCOTT.EMP.COMM=4,0
SCOTT.EMP.DEPTNO=2,0
SCOTT.BONUS.SAL=,
SCOTT.BONUS.COMM=,
SCOTT.DEPT.DEPTNO=2,0
Easily adjusted for subsets of the data, etc.
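For example, here is a variant of the block above that samples each table with the SAMPLE clause instead of scanning every row, and translates the measured digits/decimals into a rough target type - the 5% sample size and the int/bigint/decimal cut-offs are just assumptions to tune for your own data:

set serverout on
declare
  -- sample 5% of each table rather than doing a full scan
  -- (adjust to taste; SAMPLE only works on real tables, not views)
  l_template varchar2(1000) :=
    'select length(to_char(trunc(max(abs(@@COL))))) digits, '||
    '       max(length(to_char(abs(@@COL-trunc(@@COL)))))-1 decimals '||
    'from @@TAB sample (5)';
  l_sql  varchar2(1000);
  l_dig  int;
  l_dec  int;
  l_type varchar2(30);
begin
  for i in ( select owner, table_name, column_name
             from dba_tab_cols
             where data_type in ('NUMBER','FLOAT')
             and owner = 'SCOTT'
           )
  loop
    l_sql := replace(replace(l_template,'@@COL',i.column_name),'@@TAB',i.owner||'.'||i.table_name);
    execute immediate l_sql into l_dig, l_dec;
    -- crude mapping of the observed digits/decimals to a target type
    l_type := case
                when l_dig is null    then 'unknown (no data)'
                when nvl(l_dec,0) > 0 then 'decimal('||(l_dig+l_dec)||','||l_dec||')'
                when l_dig <= 9       then 'int'
                else                       'bigint'
              end;
    dbms_output.put_line(i.owner||'.'||i.table_name||'.'||i.column_name||' => '||l_type);
  end loop;
end;
/

Sampling keeps the cost down on very large tables, at the risk of underestimating the true precision, so you may want a bigger sample (or a full scan) for the columns that matter most.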