NVLM-D-72B
by NVIDIA
NVLM-D-72B is NVIDIA's frontier-class multimodal large language model that achieves state-of-the-art results on vision-language tasks. The model features a decoder-only architecture and demonstrates exceptional performance on both multimodal and text-only benchmarks.
Specifications
- Context Window: 32,768 tokens
- Released: September 2024
Capabilities
- Vision-Language Understanding
- Image Analysis
- Visual Question Answering
- Text Generation
- Multimodal Reasoning