Thoughts: Favorite tweets

Thursday, June 26, 2025

Favorite tweets

Wrote an intro to evals for long-context Q&A systems:
• How it differs from basic Q&A
• What dimensions & metrics to eval on
• How to build llm-evaluators
• How to build eval datasets
• Benchmarks: narratives, technical docs, multi-docs
https://t.co/XAzPcG7tvf
— Eugene Yan (@eugeneyan) Jun 25, 2025

from Twitter https://twitter.com/eugeneyan

June 25, 2025 at 01:08AM
via IFTTT
