Oracle Machine Learning Office Hours

Free tips and training every month! Subscribe for reminders and more from Office Hours.

February 18

17:00 UTC

Description

Oracle Machine Learning for Spark
We saw how Oracle Machine Learning for Spark (OML4Spark) offers interfaces for running machine learning algorithms on data lakes, using Spark to distribute computation across nodes. It integrates with the big data ecosystem, letting users manipulate tables in Hive and Impala and work with HDFS and Oracle Database, all with the R language as the front end.

OML4Spark makes the open source R scripting language and environment ready for the enterprise and big data. Designed for problems involving both large and small volumes of data, it integrates R with data lakes, allowing users to run R commands and scripts for data processing, statistics, and machine learning on Hive, Impala, and Spark DataFrame tables and views using R and Spark SQL syntax. Many familiar R functions are overloaded so that calls to them are translated into SQL and executed inside the data lake.
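
To make the idea of overloaded functions that translate into SQL more concrete, here is a minimal toy sketch in Python. It is not OML4Spark's API (which is R-based) and does not reproduce how the product works internally; the class names `LazyTable`, `Column`, and `Predicate`, and the `bike_rentals` table with its `temp` and `hour` columns, are all hypothetical. It only illustrates the general technique: an overloaded operator builds SQL text instead of evaluating locally, so the data can stay where it lives.

```python
# Toy illustration only: NOT the OML4Spark API. All names are hypothetical.


class Predicate:
    """Holds a fragment of SQL produced by an overloaded operator."""

    def __init__(self, sql):
        self.sql = sql


class Column:
    """A column reference whose '>' operator emits SQL instead of a boolean."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        # Build a SQL predicate rather than comparing anything locally.
        return Predicate(f"{self.name} > {value!r}")


class LazyTable:
    """A proxy for a remote table; operations accumulate into one SQL query."""

    def __init__(self, name, predicates=()):
        self.name = name
        self.predicates = tuple(predicates)

    def col(self, name):
        return Column(name)

    def filter(self, predicate):
        # Return a new proxy; no data is read or moved at this point.
        return LazyTable(self.name, self.predicates + (predicate.sql,))

    def to_sql(self):
        sql = f"SELECT * FROM {self.name}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(self.predicates)
        return sql


bikes = LazyTable("bike_rentals")  # hypothetical table name
query = bikes.filter(bikes.col("temp") > 20).filter(bikes.col("hour") > 6)
print(query.to_sql())
# prints: SELECT * FROM bike_rentals WHERE temp > 20 AND hour > 6
```

In a real system the generated statement would be sent to the SQL engine for execution, and only the result (or a handle to it) would return to the client session.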

Oracle Machine Learning consists of complementary components supporting scalable machine learning algorithms for in-database and big data environments (including Cloud and on-premises), notebook technology, SQL, Python and R APIs, and Hadoop/Spark environments.

The slides used in the presentation can be found in the Resources section below.

Video highlights:
04:50 Introduction to Oracle Machine Learning for Spark
07:10 Oracle Machine Learning for Spark integration
09:56 OML4Spark R language API
11:40 OML4Spark performance benchmark
13:55 OML4Spark benefits for R users of Spark MLlib
17:20 Demo - Manipulating HDFS data
22:00 Demo - Manipulating Hive, Impala and Spark DataFrames
36:48 Demo - Using OML4Spark ML models to predict Bike Demand
43:45 Demo - OML4Spark Cross-Validation and Classification Model Selection
47:54 Demo - Benchmark of OML4Spark GLM Logistic on 100 million records
49:26 OML4Spark Roadmap
51:09 Q&A

Your Experts

Marcos Arancibia, Senior Principal Product Manager, Machine Learning
Marcos Arancibia is the Product Manager for Oracle Machine Learning, working with Machine Learning in the Oracle Database and on Spark. He develops product strategy, roadmap prioritization, product positioning and product evangelization for Oracle Machine Learning. Before joining Oracle in 2010 he spent 13 years at SAS Institute Inc., in roles ranging from Country Manager in LAD to Regional Data Mining Lead in the US. He holds a bachelor's degree in Statistics from UNICAMP in Brazil, where he also completed master's-level coursework in Statistics. He holds certifications from Stanford in AI and Machine Learning, and from the University of Washington in Computational Neuroscience.
Mark Hornick, Senior Director, Product Management, Data Science and Machine Learning
Mark Hornick is the Senior Director of Product Management for the Oracle Machine Learning (OML) family of products. He leads the OML PM team and works closely with Product Development on product strategy, positioning, and evangelization. Mark has over 20 years of experience integrating and leveraging machine learning with Oracle technologies, working with internal and external customers to apply Oracle’s machine learning technologies to scalable and deployable data science projects. Mark is Oracle’s representative on the R Consortium’s Board of Directors, an Oracle Adviser and founding member of the Business Intelligence Warehousing and Analytics (BIWA) User Community, and Content Selection Committee Chair for the Analytics and Data Summits.

All Sessions

November 30 2021 16:00:00 UTC Weekly Office Hours: OML on Autonomous Database - Ask & Learn
November 23 2021 16:00:00 UTC Weekly Office Hours: OML on Autonomous Database - Ask & Learn
November 16 2021 16:00:00 UTC Weekly Office Hours: OML on Autonomous Database - Ask & Learn
November 9 2021 16:00:00 UTC OML Usage Highlight: Leveraging OML algorithms in Retail Science platform - Fraud Detection
October 12 2021 OML feature highlight: Time Series analysis with Oracle Machine Learning
October 5 2021 OML4Py features: Using third-party Python packages from Python, SQL and REST
September 21 2021 OML usage highlight: Live Demo of Oracle Stream Analytics with OML AutoML UI and OML Services
August 17 2021 OML Usage Highlight: ML on SailGP data: Predicting the best sailing direction
August 10 2021 OML feature highlight: Deploy an XGBoost Model using OML Services
August 3 2021 ML Concepts - Using Cross-Validation with OML in-Database and with Embedded Python Execution
June 29 2021 Weekly Office Hours: OML on Autonomous Database - Ask & Learn
June 22 2021 ML Concepts - Encoding of Categorical Attributes: OneHot vs Mean vs WoE and when to use them
June 15 2021 OML usage highlight: Machine Learning Recommendations for Maintenance and Repair
May 25 2021 Hands-On Lab using Oracle Machine Learning AutoML UI on Autonomous Database
May 18 2021 Hands-On Lab using Oracle Machine Learning Services on Autonomous Database
May 11 2021 OML usage highlight: Oracle Process Automation with Real-time OML Services scoring
April 20 2021 OML usage highlight: Oracle Stream Analytics with Real-time OML Services scoring
April 13 2021 OML usage highlight: Making Oracle Digital Assistant smarter with OML Services
March 30 2021 OML feature highlight: OML AutoML UI for Automated Model Building
March 23 2021 Weekly Office Hours: OML on Autonomous Database - Ask & Learn