
Google DeepMind's open model family Gemma is gaining popularity among developers for its speed and efficiency. Gemma models are being run locally and on devices like iPhones, achieving impressive throughput. Gemma's release has positioned Google as a key player in the open LLM domain, with reports of faster inference than the reference Hugging Face/PyTorch implementation.
Sneak peek, hey, pssst... @ollama 0.1.27 is out! Performance improvements (up to 2x) when running Gemma models. Let's try it! https://t.co/hy050gpYl4
A major advantage of using Keras 3 with the JAX backend: it's fast. And it's fast out of the box, without any need for careful performance optimization. Gemma 2B inference for a single prompt runs at 116 token/s on a V100, a 3.3x increase over the HF/PT implementation. https://t.co/bMcqZCMbYF
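The Keras 3 workflow described above can be sketched as follows. This is a minimal, hedged example assuming the KerasNLP `GemmaCausalLM` preset API and access to the published Gemma weights (a Kaggle account and `pip install keras keras-nlp` are required); it is not the exact script behind the quoted benchmark.

```python
# Minimal sketch: Gemma 2B inference with Keras 3 on the JAX backend.
# Assumes keras-nlp is installed and Gemma weights are accessible;
# "gemma_2b_en" is the published KerasNLP preset name.
import os

# The backend must be selected before keras is imported.
os.environ["KERAS_BACKEND"] = "jax"

import keras_nlp

# Download/load the pretrained 2B checkpoint and generate from one prompt.
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")
print(gemma_lm.generate("The fastest way to run an LLM locally is", max_length=64))
```

Setting `KERAS_BACKEND="jax"` is what enables the XLA-compiled fast path the tweet credits for the speedup; the same code runs unchanged on the TensorFlow or PyTorch backends.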
This is how easy it is to run a model locally. 2 commands in the terminal to get Gemma running. 👏 @ollama It's awesome to see more open source models being released. Thank you @sundarpichai @demishassabis @JeffDean @ZoubinGhahrama1 @OriolVinyalsML @asoroken @alexanderchen… https://t.co/qozcd6IJ1g
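The "2 commands in the terminal" likely look something like this. A hedged sketch assuming Ollama is already installed and that `gemma:2b` is the model tag for the 2B variant in the Ollama library:

```shell
# Fetch the Gemma 2B weights from the Ollama model library.
ollama pull gemma:2b

# Start a chat session with the model (add a quoted prompt for one-shot use).
ollama run gemma:2b
```

Ollama handles quantization and the local inference server itself, which is why no further setup is needed after these two commands.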
