A flurry of AI code models were open-sourced last night. One of them is Cogito v1 Preview, available in 3B, 8B, 14B, 32B, and 70B sizes. The 70B model reportedly outperforms the newly released Llama 4 109B MoE model. The series is optimized for coding, function calling, and agent use cases, and every model can run in both a standard mode and a reasoning mode. https://t.co/WuOdUFvIze https://t.co/19oMMoeues
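For reference, here is a minimal sketch of switching between the two modes with Hugging Face transformers. Both the repo id `deepcogito/cogito-v1-preview-llama-8B` and the exact system-prompt string that enables reasoning mode are assumptions; verify them against the model card for your chosen size.

```python
# A minimal sketch of Cogito's two modes, assuming the Hugging Face repo id
# "deepcogito/cogito-v1-preview-llama-8B" and that reasoning mode is enabled
# via a special system prompt (both details assumed; check the model card).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepcogito/cogito-v1-preview-llama-8B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

def generate(messages, max_new_tokens=512):
    # Apply the model's chat template, then generate a completion.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)

# Standard mode: no special system prompt.
print(generate([{"role": "user", "content": "Write a Python quicksort."}]))

# Reasoning mode: toggled by a system prompt per the model card
# (the exact wording below is an assumption).
print(generate([
    {"role": "system", "content": "Enable deep thinking subroutine."},
    {"role": "user", "content": "Write a Python quicksort."},
]))
```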
DeepCoder-14B-Preview is a fully open-source code model. According to its authors' test results, its coding ability is on par with o3-mini. The dataset, code, and training logs have all been open-sourced, and the model can currently be tried on Together AI. https://t.co/ABSymNkWin https://t.co/ZsfCrbjnBI
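As a quick way to try it on Together AI, here is a hedged sketch using Together's Python SDK (`pip install together`). The model identifier is an assumption based on the Hugging Face repo name; check Together's model list for the exact string.

```python
# A minimal sketch of querying DeepCoder-14B-Preview through Together AI.
# The model id below is assumed from the Hugging Face repo name.
import os
from together import Together

client = Together(api_key=os.environ["TOGETHER_API_KEY"])

response = client.chat.completions.create(
    model="agentica-org/DeepCoder-14B-Preview",  # assumed model id
    messages=[{"role": "user", "content": "Implement binary search in Python."}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```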
🎉 Congratulations to the @DeepCogito team on an incredible launch! Today, they introduced Deep Cogito, a new company on a bold mission to build general superintelligence. They have also released a suite of state-of-the-art open LLMs ranging from 3B to 70B. We’re thrilled to https://t.co/m2LkeAVj2s
Deep Cogito, a San Francisco-based AI startup, has launched a lineup of open-source language models under the Cogito v1 series, ranging in size from 3 billion to 70 billion parameters. These models reportedly outperform comparably sized open-source alternatives, including Meta's Llama and Alibaba's Qwen models, and use a hybrid architecture that supports both a standard mode and a reasoning mode. They employ a novel training approach called Iterated Distillation and Amplification (IDA), in which the model spends extra compute reasoning its way to better answers and then distills those refined reasoning capabilities back into its own weights, enabling iterative self-improvement. The models perform especially well in reasoning mode, including on multilingual tasks, and also feature tool-calling support, which suits them for coding and agent-based use cases. Deep Cogito's models are available under open licenses and can be accessed through platforms such as Hugging Face, Together AI, Ollama, and Fireworks AI. The company also plans to release larger models, scaling up to 671 billion parameters, in the coming months.

In a separate release the same night, Agentica, in collaboration with Together AI, introduced DeepCoder-14B, a coding-focused model that reportedly matches o3-mini on coding benchmarks despite its much smaller size. The dataset, training recipe, and code for DeepCoder-14B have been fully open-sourced.
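To illustrate the tool-calling flow mentioned above, here is a hedged sketch that renders a tool-use prompt with the `tools` argument of transformers' chat templates. Whether the Cogito chat template accepts tool definitions this way is an assumption, as is the repo id; `get_weather` is a hypothetical tool added for illustration.

```python
# A hedged sketch of tool calling, assuming the Cogito chat template supports
# the `tools` argument of transformers' apply_chat_template (recent transformers
# versions pass JSON-schema tool definitions into the template).
from transformers import AutoTokenizer

MODEL_ID = "deepcogito/cogito-v1-preview-llama-8B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

def get_weather(city: str) -> str:
    """Return a fake weather report for `city` (hypothetical tool)."""
    return f"It is sunny in {city}."

# JSON-schema description of the tool, as the chat template expects.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
prompt = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, tokenize=False
)
# Rendered prompt the model would see; generation and parsing of the model's
# tool-call output would follow the same pattern as the earlier sketch.
print(prompt)
```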