Thoughts: Favorite tweets

Thursday, March 20, 2025

This is huge. They just allowed everyone to fine-tune LLMs with RL from your browser. You can even use GRPO, the RL method of Deepseek. A fine-tuned model "outperformed OpenAI o1 and DeepSeek-R1 with a dozen labeled data points." https://t.co/z6tgze2J6J
— Lior⚡ (@LiorOnAI) Mar 19, 2025

from Twitter https://twitter.com/LiorOnAI

March 19, 2025 at 08:16PM
via IFTTT
