Thoughts: Favorite tweets

Monday, October 27, 2025

Combining the benefits of RL and SFT with on-policy distillation, a promising approach for training small models for domain performance and continual learning. https://t.co/u5tcvw1BG5
— Mira Murati (@miramurati) Oct 27, 2025

from Twitter https://twitter.com/miramurati

October 27, 2025 at 05:06PM
via IFTTT
