NVIDIA has launched Llama Nemotron Nano VL, an 8-billion-parameter vision-language model optimized for advanced document understanding tasks. The model holds the top position on the OCRBench V2 (English) leaderboard, a comprehensive bilingual benchmark featuring four times more tasks than its predecessor. Llama Nemotron Nano VL excels at extracting diverse information from complex documents, including tables, charts, diagrams, and video frames, while running efficiently on a single GPU. In addition, NVIDIA's Parakeet-TDT-0.6B-v2 speech AI model currently ranks first on the Hugging Face ASR leaderboard, underscoring the company's advances in both document processing and speech recognition.
Our NVIDIA Parakeet-TDT-0.6B-v2 is currently #1 on the @huggingface ASR leaderboard 🏆, alongside four other top-ranking Parakeet models. 🦜 Explore how these #opensource speech AI models are setting new benchmarks for accuracy, speed, and versatility. Tech blog with details https://t.co/A8uEjGxFNL
NVIDIA just released Llama-Nemotron-Nano-VL-8B-V1, an 8B vision model that reads dense documents, charts, and video frames. It's #1 on OCRBench V2 (English), with layout and OCR fused end-to-end. https://t.co/Hg5mYYYgu6
NVIDIA AI Releases Llama Nemotron Nano VL: A Compact Vision-Language Model Optimized for Document Understanding NVIDIA has introduced Llama Nemotron Nano VL, a vision-language model (VLM) designed to address document-level understanding tasks with efficiency and precision. Built https://t.co/RZwjFuzYdC