Several new AI models have been announced, showing advances in natural language processing and vision-language capabilities. LocalAI introduced multiple models, including '72b-qwen2.5-kunou-v1', a successor to L3-70B-Euryale-v2.2, and 'deepthought-8b-llama-v0.01-alpha', built on LLaMA-3.1 with 8 billion parameters. The 'l3.3-70b-euryale-v2.3' model has also been released as a direct replacement for Euryale v2.2. SiliconFlowAI launched 'DeepSeek-V2.5-1210', an improved model for math, coding, and writing, now available on SiliconCloud. The new 'Llama-3.2-3B' model has joined the FlowerTune LLM leaderboard, while 'Llama 3.3', a 70-billion-parameter model, offers performance comparable to larger models at reduced cost. Cohere introduced 'Command R7B', the smallest and fastest model in its R series, designed for efficiency and lower deployment costs. DeepSeek launched 'DeepSeek-VL2', a vision-language model available in 1.0B, 2.8B, and 4.5B parameter sizes, using a mixture-of-experts architecture for improved performance across tasks. Together these releases reflect a continued push toward efficiency and versatility in deployed models.
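In a mixture-of-experts layer like the one mentioned for DeepSeek-VL2, each token is routed to only a few expert feed-forward networks, so only a fraction of the parameters are active per token. The snippet below is a minimal, hypothetical sketch of top-k expert routing in PyTorch; it is not DeepSeek's implementation, and the class name, dimensions, and expert count are illustrative assumptions.

```python
# Hypothetical top-k mixture-of-experts routing sketch (illustrative only,
# not DeepSeek's code). Each token is scored by a router and processed by
# only its top_k experts, whose outputs are combined with renormalized weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, dim=512, num_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # scores every expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):                              # x: (tokens, dim)
        weights = F.softmax(self.router(x), dim=-1)    # routing probabilities
        topw, topi = weights.topk(self.top_k, dim=-1)  # keep only top_k experts/token
        topw = topw / topw.sum(dim=-1, keepdim=True)   # renormalize kept weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = topi[:, slot] == e              # tokens routed to expert e
                if mask.any():
                    out[mask] += topw[mask, slot, None] * self.experts[e](x[mask])
        return out

print(TopKMoE()(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```

Because only `top_k` of the experts run for any given token, per-token compute stays roughly constant even as the total number of experts (and total parameter count) grows.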
The new Cohere architecture is VERY interesting - 3 layers of sliding window attn w/ RoPE and then a global attn layer with NO positional encoding at all.
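As a rough illustration of that interleaved pattern (three sliding-window attention layers with RoPE, then one global attention layer with no positional encoding, repeated through the depth), here is a minimal Python sketch. It is not Cohere's or MLX's code; the window size, pattern length, and layer count are assumptions chosen for clarity.

```python
# Minimal sketch (illustrative, not Cohere's implementation) of the
# interleaved attention pattern described above: every 4th layer uses
# global attention with no positional encoding, the other 3 use
# sliding-window attention with RoPE.
SLIDING_WINDOW = 4096  # assumed local-attention window size
PATTERN = 4            # 3 local (RoPE) layers, then 1 global (NoPE) layer
NUM_LAYERS = 32        # assumed depth, for illustration only

def layer_plan(num_layers: int = NUM_LAYERS):
    """Return (layer_idx, attention_kind, positional_encoding) tuples."""
    plan = []
    for i in range(num_layers):
        if (i + 1) % PATTERN == 0:
            # Global layer: full attention over the whole context, no RoPE.
            plan.append((i, "global", "none"))
        else:
            # Local layer: sliding-window attention with rotary embeddings.
            plan.append((i, f"sliding_window({SLIDING_WINDOW})", "rope"))
    return plan

if __name__ == "__main__":
    for idx, attn, pos in layer_plan(8):
        print(f"layer {idx}: attn={attn}, pos_enc={pos}")
```

The appeal of a pattern like this is that most layers attend only within a local window (keeping the KV cache small), while the occasional global, position-free layer still lets information flow across the whole context.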
cohere2 coming to mlx https://t.co/4N6BfCWKrS
cohere command-r7b-12-2024 is now available in anychat, try it out https://t.co/pcAtTgjMNv