Thoughts: Favorite tweets

Tuesday, February 4, 2025

Wow, someone just released a notebook to train a reasoning LLM with the new RL algorithm from DeepSeek, GRPO. In <2 hours, you can transform a very small model, Qwen 0.5 (500 million parameters) into a tiny math reasoning machine. https://t.co/Su0cJ6kw9H
— Lior⚡ (@LiorOnAI) Feb 4, 2025

from Twitter https://twitter.com/LiorOnAI

February 04, 2025 at 06:54PM
via IFTTT
