Efficient Memory Management for Large Language Model Serving with PagedAttention: improves the throughput of popular LLMs by 2-4x at the same level of latency compared to state-of-the-art systems. repo: https://t.co/wJwTJyG1vh abs: https://t.co/PfWAjvX2zn https://t.co/aBIOuC8rLY
— Aran Komatsuzaki (@arankomatsuzaki) Sep 13, 2023
from Twitter https://twitter.com/arankomatsuzaki
September 12, 2023 at 05:55PM
via IFTTT
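For context on the technique the tweet names: PagedAttention stores the KV cache in fixed-size blocks drawn from a shared pool and maps each sequence's tokens to those blocks through a per-sequence block table, analogous to OS paging, which is where the memory savings and throughput gain come from. Below is a minimal sketch of that bookkeeping; the class, method names, and block size are illustrative assumptions, not vLLM's actual API.

```python
# A minimal sketch of paged KV-cache bookkeeping in the spirit of PagedAttention.
# All names and sizes here are illustrative assumptions, not vLLM's real code.

BLOCK_SIZE = 16  # tokens stored per KV-cache block (assumed toy value)


class PagedKVCache:
    """Allocates KV-cache memory in fixed-size blocks from a shared pool and maps
    each sequence's logical token positions to physical blocks on demand,
    much like virtual-memory paging in an operating system."""

    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))    # unused physical block ids
        self.block_tables: dict[int, list[int]] = {}  # seq_id -> physical block ids
        self.seq_lens: dict[int, int] = {}            # seq_id -> tokens stored so far

    def append_token(self, seq_id: int) -> tuple[int, int]:
        """Return (physical_block, offset) for the next token's KV entry,
        taking a new block from the pool only when the current one is full."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.seq_lens.get(seq_id, 0)
        if length % BLOCK_SIZE == 0:  # current block full, or sequence has none yet
            if not self.free_blocks:
                raise MemoryError("KV-cache pool exhausted")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = length + 1
        return table[-1], length % BLOCK_SIZE

    def free_sequence(self, seq_id: int) -> None:
        """Return all of a finished sequence's blocks to the shared pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)


# Example: concurrent sequences share one pool without pre-reserving memory
# for their maximum possible lengths.
cache = PagedKVCache(num_blocks=64)
for _ in range(20):
    cache.append_token(seq_id=0)   # spills into a second block after 16 tokens
cache.append_token(seq_id=1)
cache.free_sequence(seq_id=0)      # its blocks become reusable immediately
```

Because blocks are allocated only as tokens arrive and returned as soon as a sequence finishes, far more requests can be batched into the same GPU memory, which is the mechanism behind the reported throughput gain.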