Thoughts: Favorite tweets

Sunday, January 26, 2025

Lots of hot takes on whether it's possible that DeepSeek made training 45x more efficient, but @doodlestein wrote a very clear explanation of how they did it. Once someone breaks it down, it's not hard to understand. Rough summary: * Use 8 bit instead of 32 bit floating point… https://t.co/svp5bfXpbd
— Jared Friedman (@snowmaker) Jan 26, 2025

from Twitter https://twitter.com/snowmaker

January 26, 2025 at 09:31PM
via IFTTT
