The core concept of knowledge distillation, from the Gemma-2 paper. Knowledge distillation is becoming crucial. This week we saw a great result from @nvidia: a 4B model distilled from Llama 3.1 8B by combining weight pruning and knowledge distillation 🧬 Knowledge… https://t.co/CDx5pLgTNx https://t.co/l1hYpr11fm
— Rohan Paul (@rohanpaul_ai) Aug 17, 2024
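For readers unfamiliar with the idea: in this style of knowledge distillation the small "student" model is trained to match the large "teacher" model's full next-token probability distribution, not just the hard one-hot label. Below is a minimal, hedged sketch of such a soft-target loss in PyTorch; the function name, temperature value, and vocabulary size are illustrative choices, not details taken from the Gemma-2 or NVIDIA papers.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target distillation loss: KL divergence between the teacher's and
    the student's softened token distributions (Hinton-style KD)."""
    # Soften both distributions with a temperature; the student side must be
    # log-probabilities for kl_div, the teacher side plain probabilities.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # batchmean KL, scaled by T^2 so gradient magnitudes stay comparable to a
    # hard-label cross-entropy term if the two are mixed.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Toy usage: a batch of 4 token positions over a 32k-entry vocabulary.
student_logits = torch.randn(4, 32000, requires_grad=True)
teacher_logits = torch.randn(4, 32000)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```

In the pruning-plus-distillation recipe the tweet refers to, a loss of this general shape is what lets the pruned student recover quality by imitating the unpruned teacher's outputs on a comparatively small amount of data.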