One of our Oracle data sources has hundreds of tables in which all numeric columns are defined using the NUMBER data type without precision and scale. In fact a column may store pure integer values or decimal values - there is no way to tell by looking at the data type alone. This is a big problem because when I try to load any of this data into Big Data tools (sqoop, hive, spark etc.), all of them treat these columns as strings, and that is a super painful issue to deal with. I think it is an issue with all JDBC/Java based tools.
My question: is it possible to somehow detect the actual precision and scale of the values stored in a NUMBER column? My hope was that Oracle keeps this somewhere in its metadata tables, but I do not see it anywhere. My last resort is to do a random sampling of the data in each table and store the conversion schema on the side, but I really do hope there is a better way.
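For reference, the data dictionary does not record this when precision and scale are omitted - a quick check against ALL_TAB_COLUMNS (the owner and table name below are just placeholders for one of ours) simply comes back NULL for both:

select column_name, data_type, data_precision, data_scale
from   all_tab_columns
where  owner = 'MYSCHEMA'        -- placeholder
and    table_name = 'TABLEA'     -- placeholder
and    data_type = 'NUMBER';

-- DATA_PRECISION and DATA_SCALE are NULL for columns declared as plain NUMBER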
Example: in tableA, column ID has type NUMBER, column AMOUNT has type NUMBER and column QUANTITY has type NUMBER - no precision or scale specified.
But in fact ID should be bigint, AMOUNT should be decimal(18,6) and QUANTITY should be int.
I cannot possibly do the mapping manually because I have 600 tables and each table has 50-300 columns. Data sampling is my last resort.
Thanks!
I'm lost. Without data sampling, how do you expect to know anything about a column defined as NUMBER other than that it could contain *any* kind of NUMBER? Other than the limits of the NUMBER datatype itself (https://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#CHDHDHGB), there is nothing we can see without looking at the data. You could do something like this:
SQL> create table t ( x number );
Table created.
SQL>
SQL> insert into t values (123);
1 row created.
SQL> insert into t values (123123);
1 row created.
SQL> insert into t values (123.123);
1 row created.
SQL> insert into t values (123.123123123);
1 row created.
SQL> insert into t values (123.123123123123123);
1 row created.
SQL> insert into t values (-12.12312312);
1 row created.
SQL> insert into t values (12.123123123456345643745867);
1 row created.
SQL> insert into t values (12312312313123);
1 row created.
SQL> insert into t values (0.678);
1 row created.
SQL> insert into t values (0.678765675678);
1 row created.
SQL> insert into t values (0.1);
1 row created.
SQL>
SQL> select
2 length(to_char(trunc(max(abs(x))))) digits,
3 max(length(to_char(abs(x-trunc(x)))))-1 decimals
4 from t;
DIGITS DECIMALS
---------- ----------
14 24
1 row selected.
to get a reasonable approximation, and that is easy enough to craft into something dynamic, e.g.
SQL> set serverout on
SQL> declare
2 l_template varchar2(1000) := 'select length(to_char(trunc(max(abs(@@COL))))) digits, max(length(to_char(abs(@@COL-trunc(@@COL)))))-1 decimals from @@TAB';
3 l_sql varchar2(1000);
4 l_dig int;
5 l_dec int;
6 begin
7 for i in ( select owner, table_name, column_name
8 from dba_tab_cols
9 where data_type in ('NUMBER','FLOAT')
10 and owner = 'SCOTT'
11 )
12 loop
13 l_sql := replace(replace(l_template,'@@COL',i.column_name),'@@TAB',i.owner||'.'||i.table_name);
14 execute immediate l_sql into l_dig, l_dec;
15 dbms_output.put_line(i.owner||'.'||i.table_name||'.'||i.column_name||'='||l_dig||','||l_dec);
16 end loop;
17 end;
18 /
SCOTT.EMP.EMPNO=4,0
SCOTT.EMP.MGR=4,0
SCOTT.EMP.SAL=4,0
SCOTT.EMP.COMM=4,0
SCOTT.EMP.DEPTNO=2,0
SCOTT.BONUS.SAL=,
SCOTT.BONUS.COMM=,
SCOTT.DEPT.DEPTNO=2,0
Easily adjusted for subsets of the data, etc.
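For example, here is a variant of the block above that samples each table with the SAMPLE clause instead of scanning every row, and translates the measured digits/decimals into a rough target type - the 5% sample size and the int/bigint/decimal cut-offs are just assumptions to tune for your own data:

set serverout on
declare
  -- sample 5% of each table rather than doing a full scan
  -- (adjust to taste; SAMPLE only works on real tables, not views)
  l_template varchar2(1000) :=
    'select length(to_char(trunc(max(abs(@@COL))))) digits, '||
    '       max(length(to_char(abs(@@COL-trunc(@@COL)))))-1 decimals '||
    'from @@TAB sample (5)';
  l_sql  varchar2(1000);
  l_dig  int;
  l_dec  int;
  l_type varchar2(30);
begin
  for i in ( select owner, table_name, column_name
             from dba_tab_cols
             where data_type in ('NUMBER','FLOAT')
             and owner = 'SCOTT'
           )
  loop
    l_sql := replace(replace(l_template,'@@COL',i.column_name),'@@TAB',i.owner||'.'||i.table_name);
    execute immediate l_sql into l_dig, l_dec;
    -- crude mapping of the observed digits/decimals to a target type
    l_type := case
                when l_dig is null    then 'unknown (no data)'
                when nvl(l_dec,0) > 0 then 'decimal('||(l_dig+l_dec)||','||l_dec||')'
                when l_dig <= 9       then 'int'
                else                       'bigint'
              end;
    dbms_output.put_line(i.owner||'.'||i.table_name||'.'||i.column_name||' => '||l_type);
  end loop;
end;
/

Sampling keeps the cost down on very large tables, at the risk of underestimating the true precision, so you may want a bigger sample (or a full scan) for the columns that matter most.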