The transformer architecture was introduced in the landmark 2017 machine learning paper *Attention Is All You Need*. Even before its publication, many researchers believed that the attention mechanism was among the most promising research directions for improving sequence-to-sequence models. Writing in 2015, Christopher Olah remarked:
> LSTMs were a big step in what we can accomplish with RNNs. It’s natural to wonder: is there another big step? A common opinion among researchers is: “Yes! There is a next step and it’s attention!”
This prediction turned out to be correct. Transformers are generally considered to have unseated LSTMs at competitive language modeling, and their central operating principle is the attention mechanism.
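For context, the core operation defined in the paper is scaled dot-product attention, which maps query, key, and value matrices $Q$, $K$, $V$ to a weighted combination of the values:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

where $d_k$ is the dimensionality of the keys, used to scale the dot products before the softmax.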
| Indicator | Value |
| --- | --- |
| Stars | ★★★☆☆ |
| Platform | Metaculus |
| Number of forecasts | 163 |
<iframe src="https://metaforecast.org/questions/embed/metaculus-4892" height="600" width="600" frameborder="0"></iframe>