📣 Announcing a unified AI platform connecting developers to thousands of GPUs worldwide: NVIDIA DGX Cloud Lepton (Early Access). Build, train, and deploy AI apps at scale—faster and easier than ever. Learn more & join for early access: https://t.co/Ij5MnWrSDF https://t.co/LWqitnCxEN
Get to know the #NVIDIARTXPRO 6000 Blackwell Max-Q Workstation Edition: ⚡The latest RT, Tensor, and CUDA Cores ⚡96GB of ultra-fast GDDR7 memory ⚡Unparalleled performance for data, simulation, and AI applications Learn more: https://t.co/Q1UlfDMROf https://t.co/tIgmT0tXjw
Faster, cheaper, easier @GoogleDeepMind Gemma 3 deployments on Vertex AI. We worked with the Cloud team to add more optimized configurations to deploy Gemma 3 using @vllm_project and @sgl_project! 🎉 https://t.co/abKPAf2wlY
Nvidia has entered the cloud computing market with a new service that lets AI developers rent server chips directly, positioning the company as a competitor to established cloud providers. Alongside this, Nvidia announced several hardware and software advancements, including TensorRT acceleration that improves Stable Diffusion 3.5 performance on GeForce RTX and RTX PRO GPUs. The H200 NVL GPU, featuring 141GB of HBM3e memory and a PCIe interface, is designed for efficient, dense server deployments with low inference latency. Nvidia also introduced the RTX PRO 6000 Blackwell Max-Q Workstation Edition, equipped with the latest RT, Tensor, and CUDA cores and 96GB of GDDR7 memory, targeting data, simulation, and AI workloads. Complementing these hardware developments, Nvidia launched the DGX Cloud Lepton platform in early access: a unified AI platform that connects developers to thousands of GPUs globally for building, training, and deploying AI applications.

Meanwhile, Google Cloud has integrated Nvidia technology into its offerings, including G4 VMs powered by NVIDIA RTX PRO 6000 GPUs, which it says deliver four times the performance for AI, graphics, and simulation workloads. Google Cloud's Cloud Run now supports NVIDIA L4 GPUs for serverless AI inference with pay-per-second billing and rapid cold starts. Google DeepMind has also optimized Gemma 3 deployments on Vertex AI, improving AI model deployment efficiency.