CMP324-R1: Deliver high performance ML inference with AWS Inferentia - a podcast by AWS

from 2021-01-31


Customers across diverse industries are defining entirely new categories of products and experiences by running intelligent applications with ML at their core. These applications are becoming more expensive to run in production. AWS Inferentia is a custom-built machine learning inference chip designed to deliver high-throughput, low-latency inference at very low cost. Each chip provides hundreds of TOPS of inference throughput, allowing complex models to make fast predictions. Join this session to see the latest developments with AWS Inferentia and how they can lower your inference costs in the future.

Further episodes of AWS re:Invent 2019
