ABD403: Best Practices for Distributed Machine Learning and Predictive Analytics Using Amazon EMR and Open-Source Tools - a podcast by AWS

from 2021-01-31T22:10:42.023393


In this session, we focus on common use cases and design patterns for predictive analytics using Amazon EMR. We address accessing data from a data lake, extraction and preprocessing with Apache Spark, analytics and machine learning code development with notebooks (Jupyter, Zeppelin), and data visualization using Amazon QuickSight. We also cover operational topics, such as deployment patterns for ad hoc exploration and batch workloads using Spot Instances and multi-user notebooks. The intended audience includes technical users who build statistical and data analytics models for the business using tools such as Python, R, Spark, Presto, Amazon EMR, and notebooks.
