The transformer architecture of GPT upper-bounds its ability at memorization. It cannot learn many algorithms due to the functional form of its forward pass, and it spends a fixed amount of compute per token, i.e. it can't "think for a while". Progress here is critical and likely, but non-trivial. pic.twitter.com/ULJdf35MJU
— Andrej Karpathy (@karpathy) July 19, 2020
from Twitter https://twitter.com/karpathy
July 19, 2020 at 12:10PM
via IFTTT
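
A minimal sketch of the "fixed compute per token" point in the tweet (my own illustration, not from the tweet): a decoder-only transformer runs every new token through the same stack of layers, so the compute spent on a token does not depend on how hard that prediction is. The layer count, width, and context length below are GPT-2-like assumptions chosen for illustration, and the FLOP formula is a rough approximation.

```python
# Illustrative sketch (assumed sizes, not GPT's actual config): compute per
# generated token in a decoder-only transformer is fixed by the architecture.

def flops_per_token(n_layers: int, d_model: int, n_ctx: int) -> int:
    """Rough FLOP count for one forward pass of a single new token through
    the stack (attention over n_ctx cached keys/values plus the MLP),
    ignoring constants that don't change the point."""
    attn = n_layers * (4 * d_model * d_model + 2 * n_ctx * d_model)  # QKV/output projections + attention
    mlp = n_layers * (2 * 4 * d_model * d_model)                     # two 4x-wide linear layers
    return attn + mlp

# Whether the next token is trivial ("the") or would require multi-step
# reasoning, the model spends exactly the same compute on it:
easy_token_cost = flops_per_token(n_layers=48, d_model=1600, n_ctx=1024)
hard_token_cost = flops_per_token(n_layers=48, d_model=1600, n_ctx=1024)
assert easy_token_cost == hard_token_cost  # no way to "think for a while"
```

The only lever is depth and width chosen at training time; at inference there is no mechanism for allocating extra passes to a difficult token, which is the limitation the tweet points at.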