A flurry of AI code models were open-sourced last night. One of them is Cogito v1 Preview, available in 3B, 8B, 14B, 32B, and 70B sizes. The 70B model reportedly outperforms the newly released Llama 4 109B MoE model. The series is optimized for coding, function calling, and agent use cases, and every model can run in both a standard mode and a reasoning mode. https://t.co/WuOdUFvIze https://t.co/19oMMoeues
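For reference, here is a minimal sketch of switching between the two modes with Hugging Face transformers. Both the repo id `deepcogito/cogito-v1-preview-llama-8B` and the exact system-prompt string that enables reasoning mode are assumptions; verify them against the model card for your chosen size.

```python
# A minimal sketch of Cogito's two modes, assuming the Hugging Face repo id
# "deepcogito/cogito-v1-preview-llama-8B" and that reasoning mode is enabled
# via a special system prompt (both details assumed; check the model card).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepcogito/cogito-v1-preview-llama-8B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

def generate(messages, max_new_tokens=512):
    # Apply the model's chat template, then generate a completion.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)

# Standard mode: no special system prompt.
print(generate([{"role": "user", "content": "Write a Python quicksort."}]))

# Reasoning mode: toggled by a system prompt per the model card
# (the exact wording below is an assumption).
print(generate([
    {"role": "system", "content": "Enable deep thinking subroutine."},
    {"role": "user", "content": "Write a Python quicksort."},
]))
```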
DeepCoder-14B-Preview is a fully open-source code model. According to its authors' test results, its coding ability is on par with o3-mini. The dataset, code, and training logs have all been open-sourced, and the model can currently be tried on Together AI. https://t.co/ABSymNkWin https://t.co/ZsfCrbjnBI
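As a quick way to try it on Together AI, here is a hedged sketch using Together's Python SDK (`pip install together`). The model identifier is an assumption based on the Hugging Face repo name; check Together's model list for the exact string.

```python
# A minimal sketch of querying DeepCoder-14B-Preview through Together AI.
# The model id below is assumed from the Hugging Face repo name.
import os
from together import Together

client = Together(api_key=os.environ["TOGETHER_API_KEY"])

response = client.chat.completions.create(
    model="agentica-org/DeepCoder-14B-Preview",  # assumed model id
    messages=[{"role": "user", "content": "Implement binary search in Python."}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```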
🎉 Congratulations to the @DeepCogito team on an incredible launch! Today, they introduced Deep Cogito, a new company on a bold mission to build general superintelligence. They have also released a suite of state-of-the-art open LLMs ranging from 3B to 70B. We’re thrilled to https://t.co/m2LkeAVj2s
Deep Cogito, a San Francisco-based AI startup, has launched a lineup of open-source language models under the Cogito v1 series, ranging in size from 3 billion to 70 billion parameters. These models reportedly outperform comparably sized open-source alternatives, including Meta's Llama and Alibaba's Qwen models, and use a hybrid architecture that supports both a standard mode and a reasoning mode. They employ a novel training approach called Iterated Distillation and Amplification (IDA), in which the model spends extra compute reasoning its way to better answers and then distills those refined reasoning capabilities back into its own weights, enabling iterative self-improvement. The models perform especially well in reasoning mode, including on multilingual tasks, and also feature tool-calling support, which suits them for coding and agent-based use cases. Deep Cogito's models are available under open licenses and can be accessed through platforms such as Hugging Face, Together AI, Ollama, and Fireworks AI. The company also plans to release larger models, scaling up to 671 billion parameters, in the coming months.

In a separate release the same night, Agentica, in collaboration with Together AI, introduced DeepCoder-14B, a coding-focused model that reportedly matches o3-mini on coding benchmarks despite its much smaller size. The dataset, training recipe, and code for DeepCoder-14B have been fully open-sourced.
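To illustrate the tool-calling flow mentioned above, here is a hedged sketch that renders a tool-use prompt with the `tools` argument of transformers' chat templates. Whether the Cogito chat template accepts tool definitions this way is an assumption, as is the repo id; `get_weather` is a hypothetical tool added for illustration.

```python
# A hedged sketch of tool calling, assuming the Cogito chat template supports
# the `tools` argument of transformers' apply_chat_template (recent transformers
# versions pass JSON-schema tool definitions into the template).
from transformers import AutoTokenizer

MODEL_ID = "deepcogito/cogito-v1-preview-llama-8B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

def get_weather(city: str) -> str:
    """Return a fake weather report for `city` (hypothetical tool)."""
    return f"It is sunny in {city}."

# JSON-schema description of the tool, as the chat template expects.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
prompt = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, tokenize=False
)
# Rendered prompt the model would see; generation and parsing of the model's
# tool-call output would follow the same pattern as the earlier sketch.
print(prompt)
```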