I wrote a short overview of DeepSeek R1's training process: https://t.co/OLsz2u1fwJ Will follow up with the knowledge distillation later this week. https://t.co/HUQfBISwbo
— Andriy Burkov (@burkov) Jan 27, 2025
from Twitter https://twitter.com/burkov
January 27, 2025 at 11:56PM
via IFTTT