Thoughts: Favorite tweets

Monday, February 19, 2024

Until now, HELM has evaluated LMs on short responses, where evaluation is simple. We now introduce HELM Instruct, which evaluates open-ended instruction following. We evaluate 4 models on 7 scenarios using 4 evaluators against 5 criteria: https://t.co/5dyRHDjlrF
— Percy Liang (@percyliang) Feb 19, 2024

from Twitter https://twitter.com/percyliang

February 19, 2024 at 05:36PM
via IFTTT
