By now you may have seen some hubbub about @MosaicML’s MPT-7B series of models: MPT-7B base, MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+. These models were pretrained on the same 1T token data mix. In this 🧵 I break down the decisions behind our pretraining data mix https://t.co/ifoWjqUU4S
— Matthew Leavitt (@leavittron) May 12, 2023
from Twitter https://twitter.com/leavittron
May 12, 2023 at 11:51AM
via IFTTT