New fully open-source GPT-3-architecture models were just released, in seven sizes with 13B parameters being the largest. The training appears to follow the compute-optimal Chinchilla formula of 20 tokens per model parameter. https://t.co/LyGPY51W5L https://t.co/61mnerHHH6
— anton (@abacaj) Mar 28, 2023
from Twitter https://twitter.com/abacaj
March 28, 2023 at 11:54AM
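The Chinchilla rule of thumb mentioned in the tweet is simple enough to sketch: multiply the parameter count by 20 to get the compute-optimal training token budget. A minimal illustration (only the 13B size is from the tweet; the smaller sizes shown are illustrative):

```python
# Chinchilla compute-optimal training: roughly 20 tokens per model parameter.
TOKENS_PER_PARAM = 20

def chinchilla_tokens(n_params: float) -> float:
    """Return the approximate compute-optimal number of training tokens."""
    return TOKENS_PER_PARAM * n_params

# 13B is the largest model in the release; other sizes here are illustrative.
for params in [1.3e9, 6.7e9, 13e9]:
    print(f"{params / 1e9:.1f}B params -> {chinchilla_tokens(params) / 1e9:.0f}B tokens")
```

For the 13B model this works out to about 260B training tokens.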