Oracle Machine Learning Office Hours

December 17

16:00 UTC


Machine Learning 101: Feature Extraction
Have you always been curious about what machine learning can do for your business problem, but never found the time to learn the necessary practical skills? Do you want to learn what Classification, Regression, Clustering and Feature Extraction techniques do, and how to apply them using the Oracle Machine Learning family of products?

Join us for this special series “Oracle Machine Learning Office Hours – Machine Learning 101”, where we will go through the main steps of solving a Business Problem from beginning to end, using the different components available in Oracle Machine Learning: programming languages and interfaces, including Notebooks with SQL, UI, and languages like R and Python.

This seventh session in the series covered Feature Extraction 101. We learned about methods for extracting meaningful attributes from datasets with a large number of columns, and explored Dimensionality Reduction and how it can be beneficial as a pre-processing step for Machine Learning models.
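
As a rough illustration of dimensionality reduction as a pre-processing step, the sketch below projects a wide synthetic dataset onto its top singular vectors. It uses plain NumPy rather than the Oracle Machine Learning interfaces shown in the session; the function name `svd_project` and the synthetic data are assumptions made up for this example.

```python
import numpy as np

def svd_project(X, k):
    """Project the rows of X onto the top-k right singular vectors (PCA-style).

    Hypothetical helper for illustration; not part of any Oracle API.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # center each column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                        # shape (k, n_columns)
    projected = Xc @ components.T              # shape (n_rows, k)
    explained = (s[:k] ** 2).sum() / (s ** 2).sum()  # share of variance kept
    return projected, components, explained

rng = np.random.default_rng(0)
# Synthetic data (an assumption): 10 columns driven by 2 latent factors plus small noise
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

Z, components, explained = svd_project(X, k=2)
print(Z.shape)      # reduced representation: one row per record, k columns
print(explained)    # fraction of total variance captured by the k features
```

The reduced matrix `Z` could then be passed to a downstream model, such as a logistic regression, in place of the original columns.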

Video Highlights
00:53 Oracle Machine Learning Office Hours - next Session
01:59 Machine Learning 101 - Feature Extraction
02:20 Feature Extraction 101 - Introduction
04:20 Feature Selection
06:28 Feature Extraction algorithms
08:25 Attribute Importance
10:05 Singular Value Decomposition (SVD)
11:24 Non-negative Matrix Factorization (NMF)
12:30 Intuition on NMF
15:10 Explicit Semantic Analysis (ESA)
15:47 Feature Extraction 101 Demo
18:18 Data used for the Demo
19:32 Explore the data
20:03 Basic visualization on the data
21:05 Build Attribute Importance (AI) model
23:30 Logistic Regression: Full Model vs. Feature Selection via AI
26:43 Build Singular Value Decomposition (SVD) model
30:00 Relationship between SVD feature vectors and attributes
33:16 Logistic Regression: Full Model vs. Feature Selection via SVD
34:55 Create Principal Components (PCA) projections for test data using the SVD model
35:55 Plot 3-D of top 3 attributes on original data vs. Projected PCA components
39:50 Build Non-negative Matrix Factorization (NMF) model via PL/SQL
41:45 Relationship between NMF feature vectors and attributes
43:20 Logistic Regression: Full Model vs. Feature Selection via NMF
44:44 Create NMF projections for test data using the NMF model
45:10 Plot 3-D of top 3 attributes on original data vs. Projected NMF features
46:54 Comparison with Build Logistic Regression using the built-in Feature Selection
49:20 Comparison with AutoML: Auto Algorithm Selection, Auto Feature Selection and Auto Tune for Logistic Regression
51:15 Q&A

Your Experts

Marcos Arancibia
Marcos Arancibia, Product Manager, Data Science and Big Data    
Marcos Arancibia is the Product Manager for Oracle Data Science and Big Data. He works with Machine Learning in the Oracle Database and on Big Data clusters under Hadoop and Spark, on premises and in the Oracle Cloud. He works within Product Management to develop product strategy, roadmap prioritization, product positioning and product evangelization, working closely with the engineering team to define the product roadmaps for Oracle Machine Learning and Big Data in the Cloud. Before joining Oracle 9 years ago, he spent 13 years at SAS Institute Inc. as a Data Mining architect and expert in the US and Latin America. He holds a Bachelor of Science degree in Statistics, with additional coursework in the Master of Science in Statistics program, both from UNICAMP in Brazil. He holds certifications from Stanford in AI and Machine Learning, and from the University of Washington in Computational Neuroscience. He is an expert on Deep Learning and passionate about Machine Learning.
Mark Hornick
Mark Hornick, Senior Director, Product Management, Data Science and Machine Learning    
Mark Hornick is the Senior Director of Product Management for the Oracle Machine Learning (OML) family of products. He leads the OML PM team and works closely with Product Development on product strategy, positioning, and evangelization. Mark has over 20 years of experience integrating and leveraging machine learning with Oracle technologies, working with internal and external customers to apply Oracle’s machine learning technologies to scalable and deployable data science projects. Mark is Oracle’s representative on the R Consortium’s Board of Directors, an Oracle Adviser and founding member of the Business Intelligence Warehousing and Analytics (BIWA) User Community, and Content Selection Committee Chair for the Analytics and Data Summits.