Skip to Main Content
  • Questions
  • Building a Neural Network Machine Leaning Model using Oracle

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Ahmad.

Asked: February 10, 2021 - 9:37 am UTC

Last updated: February 24, 2021 - 9:05 am UTC

Version: Oracle Database 19c Enterprise Edition Release 19.0.0.0.0

Viewed 1000+ times

You Asked

Hello!

i am interested in Building a NN using Oracle DBMS_DATA_MINING, however i am facing some technical difficulties Building such complex model.
Main Problem is the Need to Combine multiple columns in the Analysis.

I will provide an example on what i am Looking to achieve:
____________________________________________________________

-- Create a base table to score and label data, we will create a NN to Analyse customers purchases in order to find Patterns in data. 
create table customer_group_analysis ( age number(4), City varchar2(24), education_level varchar2(24), total_purchase number(24) );

insert into customer_group_analysis values (18,'London','Student', 400);
insert into customer_group_analysis values (28,'Leeds','Bachelor Degree', 1400);
insert into customer_group_analysis values (54,'Liverpool','Master Degree', 2400);
insert into customer_group_analysis values (16,'London','Student', 150);
insert into customer_group_analysis values (33,'London','PHD', 3400);
insert into customer_group_analysis values (28,'Leeds','Bachelor Degree', 2000);

commit;

-- Create Settings table for NN

create table demo_nn_settings (Setting_name varchar2(30), Setting_value varchar2(4000));

begin
  insert into demo_nn_settings (setting_name, setting_value)
  values (dbms_data_mining.algo_name, dbms_data_mining.algo_neural_network);
  
  insert into demo_nn_settings (setting_name, setting_value)
  values (dbms_data_mining.prep_auto, dbms_data_mining.prep_auto_on);
  
  insert into demo_nn_settings (setting_name, setting_value)
  values (dbms_data_mining.nnet_nodes_per_layer, '10,6,4');
end;
/
commit;

-- Create the Model 

BEGIN
   DBMS_DATA_MINING.CREATE_MODEL(
      model_name          => 'DEMO_NEURAL_NETWORK_MODEL',
      mining_function     =>  dbms_data_mining.classification,
      data_table_name     => 'customer_group_analysis',
      case_id_column_name => 'age,city,education',   -- This will throw an error! but we Need that the case_id_columns contains all required columns for the Analysis 
      target_column_name  => 'total_purchase',
      settings_table_name => 'demo_nn_settings');
END;
/
 
-- Create view to label the data

CREATE OR REPLACE VIEW NN_VIEW
AS
SELECT age,city,education,
       prediction(DEMO_NEURAL_NETWORK_MODEL USING *)  predicted_value,
       prediction_probability(DEMO_NEURAL_NETWORK_MODEL USING *) probability
FROM   customer_group_analysis;

-- Create the confusion Matrix
DECLARE
   v_accuracy NUMBER;
BEGIN
   DBMS_DATA_MINING.COMPUTE_CONFUSION_MATRIX (
      accuracy           => v_accuracy,
      apply_result_table_name      => 'NN_VIEW',
      target_table_name       => 'customer_group_analysis',
      case_id_column_name       => 'age,city,education',
      target_column_name       => 'total_purchase',
      confusion_matrix_table_name => 'NN_VIEW_confusion_matrix',
      score_column_name       => 'PREDICTED_VALUE',
      score_criterion_column_name => 'PROBABILITY',
      cost_matrix_table_name      => null,
      apply_result_schema_name    => null,
      target_schema_name       => null,
      cost_matrix_schema_name     => null,
      score_criterion_type       => 'PROBABILITY');
   DBMS_OUTPUT.PUT_LINE('**** MODEL ACCURACY ****: ' || ROUND(v_accuracy,4));
END;
/


The creation of the model will fail because I am defining three columns in the "case_id_column_name".

Unfortunately I couldn't find a way to solve this issue, could you please help me to define more than one columns in the "case_id_column_name" in the Definition of the model?

I appreciate your help!!


Regards

and Chris said...

The case_id_column_name is a unique identifier for each customer group. This doesn't have to be unique in the table, but you want all rows which are part of the same case (customer group) to have the same value.

An easy way do to this is to create a view which includes a rank/dense_rank column and pass this to the model instead of the base table.

For example:

create or replace view cust_cases as
  select rank() over ( order by age, city, education_level ) case_id,
         c.*
  from   customer_group_analysis c;

select * from cust_cases;

CASE_ID    AGE    CITY         EDUCATION_LEVEL       TOTAL_PURCHASE   
         1     16 London       Student                          150 
         2     18 London       Student                          400 
         3     28 Leeds        Bachelor Degree                 1400 
         3     28 Leeds        Bachelor Degree                 2000 
         5     33 London       PHD                             3400 
         6     54 Liverpool    Master Degree                   2400 


The downside of this is the case_ids will change if you add/remove rows from the table. The alternative is to add a case_id column to the table itself, then set it.

As you have duplicate rows for (age, city, education), this could be tricky to manage.

You could make this easier by splitting the unique attributes to a header table and put the other attributes into a child table.

For example:

create table customer_group_header ( 
  case_id integer 
    generated as identity
    primary key,
  age    number(4), City varchar2(24), education_level varchar2(24),
  unique ( age, city, education_level )
);

create table customer_group_details ( 
  case_id references customer_group_header ( case_id ),
  total_purchase number(24) 
);


And again build a view joining these two and pass that to the model.

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Reply

Ahmad, February 19, 2021 - 7:54 am UTC

Great! thank you for the tip.

I have one last Question, perhaps you have an idea…

"target_column_name" is the target column in the targets table. which contains the known target values from the test data. but also the column which the model should predict, however what if i want that the model predicts multiple columns, is it possible to Combine multiple columns as "target_column_name" and pass them to the model ?

Thx

Chris Saxon
February 19, 2021 - 9:48 am UTC

You can only specify a single column here. All the other columns are inputs to predict that value.

It may be that you need to build a different kind of model to do what you want. I only know the basics of ML though, so I'm not sure what you'd need to do instead

Reply

Ahmad, February 23, 2021 - 4:13 pm UTC

Hi,

i have a Question regarding your suggested answer to this issue, for the source datasets i have splitted the unique attributes to a Header table and put the other Attributes in a child table.

I have a csv file (training datasets) which contains also duplicate rows, should i also split the Training data in Header and child tables or no Need ?

Thx
Chris Saxon
February 24, 2021 - 9:05 am UTC

You need to assign all rows with the same values for your multi-column case identifier the same single-column case identifier.

Splitting these into header + child tables is a good way to do this

So the question to you is: is that the problem you're trying to solve? If not, what do you mean?

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.