Hello!
i am interested in Building a NN using Oracle DBMS_DATA_MINING, however i am facing some technical difficulties Building such complex model.
Main Problem is the Need to Combine multiple columns in the Analysis.
I will provide an example on what i am Looking to achieve:
____________________________________________________________
-- Create a base table to score and label data, we will create a NN to Analyse customers purchases in order to find Patterns in data.
create table customer_group_analysis ( age number(4), City varchar2(24), education_level varchar2(24), total_purchase number(24) );
insert into customer_group_analysis values (18,'London','Student', 400);
insert into customer_group_analysis values (28,'Leeds','Bachelor Degree', 1400);
insert into customer_group_analysis values (54,'Liverpool','Master Degree', 2400);
insert into customer_group_analysis values (16,'London','Student', 150);
insert into customer_group_analysis values (33,'London','PHD', 3400);
insert into customer_group_analysis values (28,'Leeds','Bachelor Degree', 2000);
commit;
-- Create Settings table for NN
create table demo_nn_settings (Setting_name varchar2(30), Setting_value varchar2(4000));
begin
insert into demo_nn_settings (setting_name, setting_value)
values (dbms_data_mining.algo_name, dbms_data_mining.algo_neural_network);
insert into demo_nn_settings (setting_name, setting_value)
values (dbms_data_mining.prep_auto, dbms_data_mining.prep_auto_on);
insert into demo_nn_settings (setting_name, setting_value)
values (dbms_data_mining.nnet_nodes_per_layer, '10,6,4');
end;
/
commit;
-- Create the Model
BEGIN
DBMS_DATA_MINING.CREATE_MODEL(
model_name => 'DEMO_NEURAL_NETWORK_MODEL',
mining_function => dbms_data_mining.classification,
data_table_name => 'customer_group_analysis',
case_id_column_name => 'age,city,education', -- This will throw an error! but we Need that the case_id_columns contains all required columns for the Analysis
target_column_name => 'total_purchase',
settings_table_name => 'demo_nn_settings');
END;
/
-- Create view to label the data
CREATE OR REPLACE VIEW NN_VIEW
AS
SELECT age,city,education,
prediction(DEMO_NEURAL_NETWORK_MODEL USING *) predicted_value,
prediction_probability(DEMO_NEURAL_NETWORK_MODEL USING *) probability
FROM customer_group_analysis;
-- Create the confusion Matrix
DECLARE
v_accuracy NUMBER;
BEGIN
DBMS_DATA_MINING.COMPUTE_CONFUSION_MATRIX (
accuracy => v_accuracy,
apply_result_table_name => 'NN_VIEW',
target_table_name => 'customer_group_analysis',
case_id_column_name => 'age,city,education',
target_column_name => 'total_purchase',
confusion_matrix_table_name => 'NN_VIEW_confusion_matrix',
score_column_name => 'PREDICTED_VALUE',
score_criterion_column_name => 'PROBABILITY',
cost_matrix_table_name => null,
apply_result_schema_name => null,
target_schema_name => null,
cost_matrix_schema_name => null,
score_criterion_type => 'PROBABILITY');
DBMS_OUTPUT.PUT_LINE('**** MODEL ACCURACY ****: ' || ROUND(v_accuracy,4));
END;
/
The creation of the model will fail because I am defining three columns in the "case_id_column_name".
Unfortunately I couldn't find a way to solve this issue, could you please help me to define more than one columns in the "case_id_column_name" in the Definition of the model?
I appreciate your help!!
Regards
The case_id_column_name is a unique identifier for each customer group. This doesn't have to be unique in the table, but you want all rows which are part of the same case (customer group) to have the same value.
An easy way do to this is to create a view which includes a rank/dense_rank column and pass this to the model instead of the base table.
For example:
create or replace view cust_cases as
select rank() over ( order by age, city, education_level ) case_id,
c.*
from customer_group_analysis c;
select * from cust_cases;
CASE_ID AGE CITY EDUCATION_LEVEL TOTAL_PURCHASE
1 16 London Student 150
2 18 London Student 400
3 28 Leeds Bachelor Degree 1400
3 28 Leeds Bachelor Degree 2000
5 33 London PHD 3400
6 54 Liverpool Master Degree 2400
The downside of this is the case_ids will change if you add/remove rows from the table. The alternative is to add a case_id column to the table itself, then set it.
As you have duplicate rows for (age, city, education), this could be tricky to manage.
You could make this easier by splitting the unique attributes to a header table and put the other attributes into a child table.
For example:
create table customer_group_header (
case_id integer
generated as identity
primary key,
age number(4), City varchar2(24), education_level varchar2(24),
unique ( age, city, education_level )
);
create table customer_group_details (
case_id references customer_group_header ( case_id ),
total_purchase number(24)
);
And again build a view joining these two and pass that to the model.