Recently vLLM added AWQ (4-bit) support... this open-source combination consumes less memory and runs inference faster for Llama-based models like 70B Llama-2 or 34B CodeLlama. I also saw better model quality using AWQ over other quantization methods like GPTQ https://t.co/F7NALZUR86
— anton (@abacaj) Sep 26, 2023
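For context, using AWQ weights in vLLM amounts to passing the quantization flag when the engine is constructed. Here is a minimal sketch using vLLM's offline Python API; the checkpoint name is illustrative, and any Llama-family model published with AWQ (4-bit) weights should work the same way.

```python
# Minimal sketch: run an AWQ-quantized Llama model with vLLM's offline API.
# The model name below is an assumption -- swap in any AWQ Llama checkpoint.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Llama-2-7B-AWQ",  # assumed community AWQ checkpoint
    quantization="awq",               # tell vLLM the weights are AWQ 4-bit
)

sampling_params = SamplingParams(temperature=0.7, max_tokens=128)

# Generate completions for a batch of prompts
outputs = llm.generate(["Explain AWQ quantization in one sentence."], sampling_params)
for out in outputs:
    print(out.outputs[0].text)
```

For the larger checkpoints mentioned in the tweet (70B Llama-2, 34B CodeLlama) you would typically also set tensor_parallel_size to shard the model across multiple GPUs.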
from Twitter https://twitter.com/abacaj
September 26, 2023 at 01:31PM