
Google's MediaPipe team launched a new Large Language Model (LLM) Inference API that can run the Gemma model (~2.5B parameters) directly in the browser as part of Google's broader Web AI effort. The launch fits a wider push to run machine learning (ML) inside the browser itself. The 🤗 Transformers.js WebGPU Embedding Benchmark showed how much WebGPU can accelerate models running locally in the browser, and Xenova (@xenovacom) announced that Transformers.js models will soon be able to use a WebGPU backend. The recent release of ONNX Runtime with WebGPU support makes in-browser ML substantially faster, with one benchmark reporting a 40x speedup. On the application side, Nigel Gebodh built an app that lets users chat with several LLMs, including GoogleAI's Gemma, MistralAI's Mistral, and HuggingFace's Zephyr, and Xenova demonstrated depth estimation with Depth Anything in under 200ms using Transformers.js and WebGPU.
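All of the projects above depend on WebGPU being available in the visitor's browser. A minimal sketch of how a page might feature-detect WebGPU and fall back to WASM before loading a Transformers.js pipeline (assumptions: a Transformers.js v3-style API with a `device` option, and a hypothetical depth-estimation checkpoint id — any compatible model id works):

```javascript
// Pick the fastest available backend: WebGPU if the browser exposes it,
// otherwise the WASM backend that Transformers.js falls back to.
const device =
  (typeof navigator !== "undefined" && navigator.gpu) ? "webgpu" : "wasm";

// Model loading is wrapped in a function so it only runs in a browser
// with the @huggingface/transformers package installed.
async function loadDepthEstimator() {
  const { pipeline } = await import("@huggingface/transformers");
  // "onnx-community/depth-anything-v2-small" is an illustrative model id.
  return pipeline("depth-estimation", "onnx-community/depth-anything-v2-small", {
    device,
  });
}

console.log(`selected backend: ${device}`);
```

In a browser with WebGPU enabled this selects `webgpu`; anywhere else (older browsers, Node without a GPU adapter) it quietly falls back to `wasm`, so the same page works in both environments.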
⚡️ Now with WebGPU support! ⚡️ Run depth estimation w/ Depth Anything in under 200ms, thanks to Transformers.js and WebGPU! Try it out yourself! 👇 https://t.co/uwNG3KutRL https://t.co/IO2pBcsWm6 https://t.co/Q6ZNb2avDC
💻 Have a chat with some of the latest #LLMs on the scene! @NigelGebodh's app lets you switch between @GoogleAI's #Gemma, @MistralAI's #Mistral, and @HuggingFace's #Zephyr models. 🎈 Try it out: https://t.co/OC6tIE9ZcL 🛠️ Learn how he built it: https://t.co/m1PJC4X0fN #AI #LLM https://t.co/vfr6HTt9Ka
WebGPU will change ML 🤯 With the recent release of ONNX Runtime with WebGPU, in-browser ML is about to change. We can now fully leverage GPUs to run ML models (think of Phi, SD, etc) entirely in the browser Benchmark in my computer: 40x faster ⚡️ https://t.co/rJDXTTcyBX https://t.co/kMIcFZOmuj
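The ONNX Runtime release mentioned in the tweet above exposes WebGPU as an execution provider. A hedged sketch of setting up a session that prefers WebGPU and falls back to WASM (assumptions: `onnxruntime-web` at a version that ships the WebGPU build, and a caller-supplied `modelUrl`):

```javascript
// Execution-provider preference: try WebGPU first, fall back to WASM.
const executionProviders = ["webgpu", "wasm"];

// Session creation is wrapped in a function so it only runs in a browser
// with the onnxruntime-web package installed and a model URL to load.
async function createSession(modelUrl) {
  // The "/webgpu" subpath loads the WebGPU-enabled bundle of onnxruntime-web.
  const ort = await import("onnxruntime-web/webgpu");
  return ort.InferenceSession.create(modelUrl, { executionProviders });
}

console.log(`provider preference: ${executionProviders.join(" > ")}`);
```

Listing providers in preference order is what makes the "40x faster" path opportunistic: browsers with WebGPU get the GPU path, while everything else still runs on WASM.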
