
The launch of StarCoder2, an open-access family of code-generation models, marks a significant advance in AI-assisted coding and software development. Developed jointly by the BigCode project, ServiceNow, Hugging Face, and NVIDIA, StarCoder2 is built on The Stack v2, billed as the largest code dataset, with over 900 billion tokens. The models are trained with a 16k-token context window and repository-level information, and they support more than 600 programming languages. StarCoder2 comes in three sizes: 3B, 7B, and 15B parameters. The 15B model was trained on 4.3 trillion tokens in total, or about 4.5 epochs over the dataset. The new models outperform the original StarCoder by a significant margin and are reported to offer the best overall performance on code completion tasks. StarCoder2 is also designed to run on most GPUs, which makes it accessible to a broader range of developers. Because the models are open access, developers can use generative AI to build enterprise applications more efficiently, with the promise of strong performance at lower cost. According to the announcements, the 15B model matches models in the 33B-parameter class on code completion benchmarks at twice the speed and half the cost for training and production use, and it even beats CodeLlama 34B.

StarCoder 2 Is a Code-Generating AI That Runs On Most GPUs https://t.co/OisO4sJ6DQ
ServiceNow, Hugging Face, and Nvidia expand StarCoder2 coding LLM https://t.co/dNqUPVTHdN
Nvidia, Hugging Face and ServiceNow release new StarCoder2 LLMs for code generation https://t.co/iMWD9sT9XV https://t.co/6Phn9AY854