
Oracle Machine Learning Office Hours



September 14

15:00 UTC



Description

ML Concepts - Best Practices when using ML Classification Metrics
In this weekly Office Hours session for Oracle Machine Learning on Autonomous Database, Jie Liu, Data Scientist for Oracle Machine Learning, covered best practices for using ML classification metrics and showed, in a live demo, a variety of ways to use them with Oracle Machine Learning for Python (OML4Py).

The Oracle Machine Learning product family helps data scientists, analysts, developers, and IT achieve data science project goals faster while taking full advantage of the Oracle platform.

Oracle Machine Learning Notebooks offers an easy-to-use, interactive, multi-user, collaborative interface based on Apache Zeppelin notebook technology, and supports SQL, PL/SQL, Python, and Markdown interpreters. It is available on all Autonomous Database versions and tiers, including the Always Free editions.

OML includes AutoML, which provides automated machine learning algorithm features for algorithm selection, feature selection and model tuning, in addition to a specialized AutoML UI exclusive to the Autonomous Database.

OML Services is also included in Autonomous Database, where you can deploy and manage native in-database OML models as well as ONNX ML models (for classification and regression) built using third-party engines, and can also invoke cognitive text analytics.

Video highlights:
00:48 Topics for today
01:01 Upcoming Sessions
02:24 Classification Metrics: Agenda Outline
03:06 Classification Metrics: Motivation
05:12 Accuracy
06:44 Probability Score Histogram
09:33 Confusion Matrix
11:12 Precision and Recall - Definition
13:15 Precision and Recall - Trade Offs
17:53 F1 Score - Combining Precision and Recall
18:53 Precision-Recall Curve
20:14 Lift Chart
21:45 Waterfall Analysis
22:44 Question: What's the concept behind the "Threshold"?
24:16 AUC and the ROC (Receiver Operating Characteristic) curve
28:19 Computation of metrics using OML4Py and SQL
30:58 Demo
33:08 Question: Which ML algorithm was used and why?
42:01 Question: Can I use AUC value instead of Accuracy?
47:04 Demo: Showing AUC is not sensitive to target imbalance
51:40 Q&A
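
The metrics named in the highlights above can be sketched in a few lines of plain Python. This is an illustration only: it is not the code shown in the session and does not use the OML4Py API. The function names and the toy labels and scores are assumptions made up for this example.

```python
def confusion_counts(y_true, scores, threshold=0.5):
    """Count TP, FP, TN, FN when a score >= threshold predicts class 1."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, scores, threshold=0.5):
    """Accuracy, precision, recall and F1 at a given score threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, scores, threshold)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Unlike the metrics above, no threshold is used.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced data: 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05, 0.05]

print(classification_metrics(y_true, scores, threshold=0.5))
print(classification_metrics(y_true, scores, threshold=0.35))
print(roc_auc(y_true, scores))
```

On this toy data, lowering the threshold from 0.5 to 0.35 raises recall from 0.5 to 1.0, while the AUC stays the same because it depends only on how the scores rank positives against negatives.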

Your Experts

  • Jie Liu

    Jie Liu is a data scientist. He works with the Oracle Machine Learning Product Management team to develop marketing content for OML products and deliver data science solutions for customers inside and outside Oracle. Before joining Oracle, he was a data scientist at Epsilon, developing a machine-learning-driven real-time bidding strategy and application for online advertising. He obtained his PhD in Electrical Engineering from the University of Notre Dame.
  • Marcos Arancibia

    Marcos Arancibia is the Product Manager for Oracle Machine Learning, working with Machine Learning in the Oracle Database and on Spark. He develops product strategy, roadmap prioritization, product positioning, and product evangelization, helping define the product roadmap for Oracle Machine Learning. Before joining Oracle in 2010, he spent 13 years at SAS Institute Inc., in roles ranging from Country Manager in LAD to Regional Data Mining lead in the US. He holds a bachelor's degree in Statistics from UNICAMP in Brazil, where he also took additional master's-level courses in Statistics. He has certifications from Stanford in AI and Machine Learning, and from the University of Washington in Computational Neuroscience.
  • Mark Hornick

    Mark Hornick is the Senior Director of Product Management for the Oracle Machine Learning (OML) family of products. He leads the OML PM team and works closely with Product Development on product strategy, positioning, and evangelization. Mark has over 20 years of experience integrating and leveraging machine learning with Oracle technologies, working with internal and external customers in the application of Oracle’s machine learning technologies for scalable and deployable data science projects. Mark is Oracle’s representative on the R Consortium’s Board of Directors, an Oracle Adviser and founding member of the Business Intelligence Warehousing and Analytics (BIWA) User Community, and Content Selection Committee Chair for the Analytics and Data Summits.
