STG312: Best Practices for Building a Data Lake in Amazon S3 and Amazon Glacier, with Special Guests, Airbnb & Viber - a podcast by AWS

from 2021-01-31T22:10:42.023393


Learn how to build a data lake for analytics in Amazon S3 and Amazon Glacier. In this session, we discuss best practices for data curation, normalization, and analysis on Amazon object storage services. We examine ways to reduce or eliminate costly extract, transform, and load (ETL) processes using query-in-place technology, such as Amazon Athena and Amazon Redshift Spectrum. We also review custom analytics integration using Apache Spark, Apache Hive, Presto, and other technologies in Amazon EMR. Finally, you'll hear from Airbnb and Viber about their solutions for big data analytics using S3 as a data lake.

Further episodes of AWS re:Invent 2017