@NVIDIA's new NVILA vision language model can handle large images and long videos without slowing down. To improve accuracy and efficiency, it uses a "scale-then-compress" strategy: - Scaling up: NVILA increases the resolution of images and videos to capture details. -… https://t.co/AGZjyhfsTD https://t.co/wIzrfQjtpw
— TuringPost (@TheTuringPost) Dec 8, 2024
from Twitter https://twitter.com/TheTuringPost
December 08, 2024 at 12:36AM