LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale Develops a procedure for Int8 matmul for the feed-forward and attention projection layers in transformers, which halves the memory needed for inference while retaining full-precision performance. https://t.co/D5qgJiycI7 https://t.co/6YIWqwPXkN
— Aran Komatsuzaki (@arankomatsuzaki) Aug 16, 2022
from Twitter https://twitter.com/arankomatsuzaki
August 15, 2022 at 05:50PM
via IFTTT
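
For intuition, here is a minimal NumPy sketch of the vector-wise Int8 matmul idea: quantize activations row-wise and weights column-wise with absmax scaling, multiply in int8 with int32 accumulation, then rescale back to float. The paper additionally keeps outlier feature dimensions in 16-bit (its mixed-precision decomposition), which this sketch omits; function names below are illustrative and are not the bitsandbytes API.

```python
import numpy as np

def absmax_quantize(x, axis):
    """Symmetric absmax quantization to int8 along the given axis."""
    scale = np.max(np.abs(x), axis=axis, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)   # avoid division by zero
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matmul(X, W):
    """X: (batch, d_in) activations, W: (d_in, d_out) weights, both float32."""
    Xq, sx = absmax_quantize(X, axis=1)        # one scale per row of X
    Wq, sw = absmax_quantize(W, axis=0)        # one scale per column of W
    # Integer matmul with int32 accumulation, then dequantize with the
    # outer product of row and column scales.
    acc = Xq.astype(np.int32) @ Wq.astype(np.int32)
    return acc.astype(np.float32) * sx * sw

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 64)).astype(np.float32)
    W = rng.normal(size=(64, 32)).astype(np.float32)
    err = np.abs(int8_matmul(X, W) - X @ W).max()
    print(f"max abs error vs. fp32 matmul: {err:.4f}")
```

The memory saving comes from storing the weights as int8 (1 byte per parameter) instead of fp16 (2 bytes), which is where the roughly 2x reduction for inference originates.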