More powerful deep learning with transformers (Ep. 84) - a podcast by Francesco Gadaleta
Published 2019-10-27
Some of the most powerful NLP models, such as BERT and GPT-2, have one thing in common: they all use the transformer architecture. This architecture is built on top of another important concept already well known to the community: self-attention.
In this episode I explain what these mechanisms are, how they work, and why they are so powerful.
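As a taste of what the episode covers, below is a minimal sketch of the scaled dot-product self-attention mechanism at the core of the transformer, following the formulation in the referenced paper "Attention is all you need". The function and variable names, the toy dimensions, and the use of NumPy are illustrative assumptions, not code discussed in the episode.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a sequence of token embeddings.

    X: (seq_len, d_model) input embeddings
    W_q, W_k, W_v: (d_model, d_k) learned projection matrices
    """
    Q = X @ W_q  # queries
    K = X @ W_k  # keys
    V = X @ W_v  # values
    d_k = Q.shape[-1]
    # Every token attends to every other token; scaling by sqrt(d_k)
    # keeps the dot products from growing with the key dimension.
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # attention weights, each row sums to 1
    return weights @ V                  # weighted sum of value vectors

# Toy usage: 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 8)
```

Because the attention weights are computed from the input itself, each output position is a context-dependent mixture of the whole sequence, which is what lets transformers model long-range dependencies without recurrence.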
Don't forget to subscribe to our newsletter or join the discussion on our Discord server.
References
- Attention is all you need: https://arxiv.org/abs/1706.03762
- The illustrated transformer: https://jalammar.github.io/illustrated-transformer
- Self-attention for generative models: http://web.stanford.edu/class/cs224n/slides/cs224n-2019-lecture14-transformers.pdf