DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale. Reduces latency by up to 7.3x over the SotA for latency-oriented scenarios and increases throughput by over 1.5x for throughput-oriented scenarios. https://t.co/U7Il1KUkPl https://t.co/3wuP7pmJuX
— Aran Komatsuzaki (@arankomatsuzaki) Jul 4, 2022
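For context, DeepSpeed exposes its inference engine through deepspeed.init_inference, which injects optimized transformer kernels and can shard a model across GPUs. Below is a minimal sketch of that entry point; the model name, parallelism degree, and generation settings are illustrative placeholders, not details taken from the tweet or the paper.

import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper targets far larger transformers
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap the model with DeepSpeed's inference engine: fused kernels are
# injected in place of the stock transformer blocks, and mp_size > 1
# would shard the model tensor-parallel across multiple GPUs.
engine = deepspeed.init_inference(
    model,
    mp_size=1,                       # tensor-parallel degree (assumed single GPU here)
    dtype=torch.half,                # fp16 for lower latency
    replace_with_kernel_inject=True, # swap in DeepSpeed's optimized inference kernels
)

inputs = tokenizer("DeepSpeed Inference enables", return_tensors="pt").to(engine.module.device)
outputs = engine.module.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))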
from Twitter https://twitter.com/arankomatsuzaki
July 03, 2022 at 05:41PM
via IFTTT