New RLHF 7B model:
* Trained with an RLAIF dataset (many sources of prompts + many models) https://t.co/wSu8l7SmHB
* Released RM (fine-tuned from Llama-2-chat) https://t.co/Hxvw5srIUh
* Trained with a different policy optimizer (APA) https://t.co/yqqMFidBKX
* SOTA on MT Bench among 7B models (8.01)… https://t.co/MHE9OzmvDu https://t.co/b93x7W2iRM
— Nathan Lambert (@natolambert) Nov 27, 2023
from Twitter https://twitter.com/natolambert
November 27, 2023 at 04:44PM
via IFTTT
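The tweet doesn't include usage code, but a released reward model fine-tuned from Llama-2-chat would typically be used to score prompt/response pairs. Below is a minimal sketch of that pattern with Hugging Face transformers; the repo id, the plain prompt + response concatenation, and the single-logit scoring head are assumptions for illustration, not details from this release.

```python
# Minimal sketch: scoring a prompt/response pair with a preference reward model.
# "org/rlhf-7b-reward-model" is a hypothetical placeholder, not the released checkpoint;
# the actual RM may expect a specific chat template or expose a different head.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "org/rlhf-7b-reward-model"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_name)
reward_model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=1, torch_dtype=torch.bfloat16
)
reward_model.eval()

prompt = "Explain RLHF in one sentence."
response = "RLHF fine-tunes a language model against a learned reward model of human preferences."

# Many preference RMs score the concatenated prompt + response as one sequence.
inputs = tokenizer(prompt + "\n" + response, return_tensors="pt", truncation=True)
with torch.no_grad():
    score = reward_model(**inputs).logits[0, 0].item()  # scalar reward
print(f"reward: {score:.3f}")
```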