Replace Human Feedback with AI Feedback? @GoogleDeepMind compared RLHF to RLAIF on the task of text summarization to evaluate if AI-generated feedback (RLAIF) can replace Human Feedback (RLHF) for LLM alignment. 🧶 https://t.co/V1ol0PJxO4
— Philipp Schmid (@_philschmid) Sep 9, 2023
from Twitter https://twitter.com/_philschmid
September 09, 2023 at 12:12AM