
Groq has announced new state-of-the-art tool use models in 8B and 70B parameter sizes, outperforming Claude 3.5 Sonnet at function calling. The 8B model achieves a processing speed of 1,050 tokens per second, while the 70B model reaches 330 tokens per second. Both models are available on the Groq Console and can be downloaded from Hugging Face. The 8B model has reached the #1 position on BFCL (the Berkeley Function Calling Leaderboard), beating all other models, including proprietary ones. Separately, DeepSeek has released a new checkpoint of DeepSeek-V2-Chat, a Mixture-of-Experts model with 236B total parameters and 21B active parameters, which can run at FP16 on 8x80GB GPUs. The updated checkpoint shows significant performance improvements, excelling on both the Arena-Hard and BigBench-Hard benchmarks.
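For context on what "function calling" means here: tool-use models emit structured calls against tool definitions supplied in the request, using the OpenAI-compatible JSON-schema format that Groq's API accepts. A minimal sketch of such a request payload follows; the model id and the `get_weather` tool are illustrative assumptions, not taken from the announcement.

```python
import json

# One tool definition in the JSON-schema "function" format used for tool use.
# get_weather is a hypothetical tool, shown only to illustrate the shape.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Request payload in the OpenAI-compatible chat-completions shape.
# The model id below is an assumption for illustration; check the Groq
# Console for the actual identifiers of the 8B/70B tool use models.
payload = {
    "model": "llama3-groq-70b-tool-use",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call a tool
}

print(json.dumps(payload, indent=2))
```

A response to this request would contain a `tool_calls` entry naming the function and its JSON arguments, which the caller executes and feeds back as a `tool` message.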
DeepSeek just dropped a new leading model on LMSYS! Same model as DeepSeek-V2-Chat but different checkpoint (236B total params, 21B active params). You should be able to run @ FP16 on 8x80GB GPUs. Not a "home setup" but still ~half of what LLaMA-3-405B would need. https://t.co/ZKHBrXmkvN https://t.co/0vv6Mha03S
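The "FP16 on 8x80GB GPUs" claim above checks out with back-of-envelope arithmetic: at 2 bytes per parameter, the full 236B weights need roughly 472 GB, which fits in 640 GB of aggregate VRAM with headroom left for activations and KV cache. A quick sketch:

```python
# Back-of-envelope: do 236B parameters at FP16 fit on 8x80GB GPUs?
params = 236e9           # total parameters (MoE: only 21B active per token)
bytes_per_param = 2      # FP16 = 2 bytes per parameter

weights_gb = params * bytes_per_param / 1e9   # weight memory in GB
total_vram_gb = 8 * 80                        # aggregate VRAM across 8 GPUs

print(f"weights: {weights_gb:.0f} GB, VRAM: {total_vram_gb} GB")
# weights: 472 GB, VRAM: 640 GB -> fits, with ~168 GB for KV cache etc.
```

Note the weights figure scales with total parameters, not active ones; the 21B active parameters matter for per-token compute, not for fitting the checkpoint in memory.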
deepseek having such improvements in their general model so quickly is mind-blowing. brilliant performance on both arena hard and bigbench hard. truly OSS king until llama3.1 405B replaces it (?) https://t.co/fW0geK4ein https://t.co/wqZ87p6r6u
New Deepseek model and they did it again! https://t.co/bUOkwe5EY7
