Thoughts: Favorite tweets

Friday, November 3, 2023

Favorite tweets

Introducing DeepSpeed-FastGen 🚀
Serve LLMs and generative AI models with
- 2.3x higher throughput
- 2x lower average latency
- 4x lower tail latency w. Dynamic SplitFuse batching
Auto TP, load balancing w. perfect linear scaling, plus easy-to-use API
https://t.co/iizM71bjqj https://t.co/x2mDwzBJK7
— DeepSpeed (@MSFTDeepSpeed) Nov 3, 2023

from Twitter https://twitter.com/MSFTDeepSpeed

November 03, 2023 at 04:51PM
via IFTTT
