RLHF is powerful; it lets us fine-tune LLMs to be more useful. What if we could do RLHF… without fine-tuning??? Excited to share Emulated Fine-Tuning (EFT)! EFT lets us “emulate” what we would have gotten if we did RLHF on a new model, without actually doing the RLHF! https://t.co/wRQU8BKjax
— Eric (@ericmitchellai) Oct 27, 2023
from Twitter https://twitter.com/ericmitchellai
October 27, 2023 at 11:01AM