Google has announced the general availability of Gemini 2.0 Flash-Lite, a new model designed for improved performance across AI applications such as voice AI, video editing, and data analytics. Developers can access the model through Google AI Studio and Google Cloud's Vertex AI. Pricing is set at $0.075 per 1 million input tokens and $0.30 per 1 million output tokens, matching the costs of its predecessor, Gemini 1.5 Flash. The new model shows improved performance across reasoning, multimodal, math, and factuality benchmarks, supports a context length of up to 1 million tokens, and offers a free tier of 1,500 requests per day. Users have noted its potential for applications requiring long context windows and native audio understanding, making it well suited for voice AI agents.
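As a rough illustration of how a developer might try the model, here is a minimal sketch using the google-genai Python SDK; the model ID string and the environment-variable name are assumptions and may differ in your project setup.

```python
# Minimal sketch: calling Gemini 2.0 Flash-Lite via the google-genai Python SDK.
# Assumes the SDK is installed (`pip install google-genai`), and that the model ID
# "gemini-2.0-flash-lite" and the GEMINI_API_KEY environment variable are what
# your account/project expects.
import os

from google import genai

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

response = client.models.generate_content(
    model="gemini-2.0-flash-lite",
    contents="Summarize the key points of this customer call transcript: ...",
)

print(response.text)
```

The same model is also exposed through Vertex AI, where the client is configured against a Google Cloud project rather than an API key.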
Google Gemini 2.0 Flash-Lite ready for production now.
$0.075 / 1M input tokens and $0.30 / 1M output tokens
Comparing that with GPT-4o mini:
Input: $0.150 / 1M tokens
Output: $0.600 / 1M tokens
Gemini 2.0 Flash-Lite improves over 1.5 Flash in reasoning, multimodal capabilities,… https://t.co/TxpkbEuhvb https://t.co/F6QWADbL8p
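To put the quoted prices in perspective, a back-of-the-envelope cost comparison could look like the sketch below; the monthly token volumes are hypothetical placeholders, not figures from the announcement.

```python
# Back-of-the-envelope cost comparison using the per-1M-token prices quoted above.
# The 200M-input / 50M-output monthly workload is an assumed example, not a benchmark.
PRICES = {  # USD per 1M tokens: (input, output)
    "gemini-2.0-flash-lite": (0.075, 0.30),
    "gpt-4o-mini": (0.150, 0.600),
}

input_tokens_m = 200   # millions of input tokens per month (assumed)
output_tokens_m = 50   # millions of output tokens per month (assumed)

for model, (in_price, out_price) in PRICES.items():
    cost = input_tokens_m * in_price + output_tokens_m * out_price
    print(f"{model}: ${cost:,.2f}/month")

# Output:
# gemini-2.0-flash-lite: $30.00/month
# gpt-4o-mini: $60.00/month
```

At these list prices, Gemini 2.0 Flash-Lite works out to half the cost of GPT-4o mini for the same token volume.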
Gemini 2.0 Flash-Lite is now generally available. I’m particularly excited about native audio understanding in this small/fast/inexpensive model. We benchmarked voicemail detection using Flash-Lite — an important capability for telephone voice AI agents. Flash-Lite outperforms… https://t.co/rKb8Ufb1Od
Gemini Flash Lite 2.0 is live! 25% cheaper than Gemini Flash 2.0, same massive 1m context length, and 100% faster so far 👀 https://t.co/18eA8i5rHp