NVLM-D-72B

by NVIDIA
API Available

NVLM-D-72B is NVIDIA's frontier-class multimodal large language model, reported to achieve state-of-the-art results on vision-language tasks. It uses a decoder-only architecture (the "D" in its name) and performs strongly on both multimodal and text-only benchmarks.

Specifications

Context Window: 32,768 tokens
Released: September 2024

Capabilities

Vision-Language Understanding, Image Analysis, Visual Question Answering, Text Generation, Multimodal Reasoning
