
Google DeepMind has announced the release of Gemma v1.1, including updates to its 2B and "7B" IT models, led by Rob Dadashi's strike team and addressing issues identified by the open-source community. The release improves instruction following, factuality, and reasoning, and introduces new reinforcement-learning algorithms, with open weights.

Additionally, Google has expanded the Gemma family with CodeGemma, a collection of models specialized for coding tasks. Available in 2B and 7B sizes with an 8192-token (8K) context, the CodeGemma models are optimized for code generation, infilling, and instruction following, with capabilities such as fill-in-the-middle (FIM) code completion and compatibility with torch.compile(). Trained on 500B additional tokens with a mix of 80% code and 20% natural language, they outperform CodeLlama 13B, are ~1.5x faster than similar models, and allow commercial use; a minimal FIM prompt is sketched below.
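
To make the fill-in-the-middle workflow concrete, here is a minimal sketch using Hugging Face Transformers. It assumes the google/codegemma-2b checkpoint id and CodeGemma's documented FIM control tokens (<|fim_prefix|>, <|fim_suffix|>, <|fim_middle|>); treat the exact prompt layout as something to verify against the model card rather than a definitive recipe.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/codegemma-2b"  # assumed checkpoint id; confirm on the model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
# Optional speedup cited in the release: import torch; model = torch.compile(model)

# FIM layout: code before the cursor follows <|fim_prefix|>, code after the
# cursor follows <|fim_suffix|>, and the model generates the missing middle
# once it sees <|fim_middle|>.
prompt = (
    "<|fim_prefix|>def mean(xs):\n    <|fim_suffix|>\n\n"
    "print(mean([1, 2, 3]))<|fim_middle|>"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48)
# Decode only the newly generated tokens (the completed middle).
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:]))
```

In practice you would also stop or truncate generation at the model's end-of-FIM control tokens (e.g. <|file_separator|>), which the model card describes.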

Rounding out the release is RecurrentGemma, a 2B model built on the Griffin architecture, which swaps the transformer's global attention for a mix of gated linear recurrences and local attention, aiming for competitive performance with high throughput and fast inference on long sequences. A toy sketch of the recurrence follows.
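
Below is a toy NumPy sketch of a gated linear recurrence, loosely in the spirit of Griffin's RG-LRU layer. The gate parameterization, shapes, and function names are illustrative assumptions, not the released implementation; the point is that the recurrent state is a fixed-size vector, so per-token cost stays constant with sequence length instead of growing like a transformer's KV cache.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_linear_recurrence(x, W_gate, W_in, log_a):
    """Toy gated linear recurrence: x is (seq_len, d), returns (seq_len, d)."""
    seq_len, d = x.shape
    h = np.zeros(d)                  # fixed-size recurrent state
    a_base = sigmoid(log_a)          # learnable per-channel decay in (0, 1)
    ys = np.empty_like(x)
    for t in range(seq_len):
        r = sigmoid(x[t] @ W_gate)   # recurrence gate
        i = sigmoid(x[t] @ W_in)     # input gate
        a = a_base ** r              # gated decay: r near 0 retains more history
        # Normalized update keeps the state's magnitude roughly stable.
        h = a * h + np.sqrt(1.0 - a**2) * (i * x[t])
        ys[t] = h
    return ys

rng = np.random.default_rng(0)
d = 4
x = rng.normal(size=(16, d))
y = gated_linear_recurrence(
    x,
    W_gate=rng.normal(size=(d, d)),
    W_in=rng.normal(size=(d, d)),
    log_a=rng.normal(size=d),
)
print(y.shape)  # (16, 4)
```

Because the loop carries only h forward, generation needs O(d) memory per layer regardless of how long the sequence grows, which is the source of the throughput gains claimed for long sequences.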

RecurrentGemma is a family of open language models built on a novel recurrent architecture developed at Google. Both pre-trained and instruction-tuned versions are available in English. https://t.co/N6cdS4VRjh
Architectural changes enable significantly higher throughput for a RecurrentGemma-variant of the Gemma models. https://t.co/pocatgfsli
Gemma is expanding.... we just announced CodeGemma, a version of Gemma tuned for code generation. And bonus... Gemma is now bumped to v1.1, addressing lots of feedback we got. Congrats Gemma team for one more amazing release! https://t.co/brZgeFtCic