Thoughts: Favorite tweets

Tuesday, May 13, 2025

Favorite tweets

OpenAI introduces HealthBench, a new open-source LLM benchmark for health! Across frontier models, o3 is the best performing model with a score of 60%, followed by Grok 3 (54%) and Gemini 2.5 Pro (52%). A deeper dive: HealthBench consists of 5,000 synthetically generated https://t.co/rcQaEj3deI
— Tanishq Mathew Abraham, Ph.D. (@iScienceLuvr) May 12, 2025

from Twitter https://twitter.com/iScienceLuvr

May 12, 2025 at 07:39PM
