📣 Announcing a unified AI platform connecting developers to thousands of GPUs worldwide: NVIDIA DGX Cloud Lepton (Early Access). Build, train, and deploy AI apps at scale—faster and easier than ever. Learn more & join for early access: https://t.co/Ij5MnWrSDF https://t.co/LWqitnCxEN
Get to know the #NVIDIARTXPRO 6000 Blackwell Max-Q Workstation Edition: ⚡The latest RT, Tensor, and CUDA Cores ⚡96GB of ultra-fast GDDR7 memory ⚡Unparalleled performance for data, simulation, and AI applications Learn more: https://t.co/Q1UlfDMROf https://t.co/tIgmT0tXjw
Faster, cheaper, easier @GoogleDeepMind Gemma 3 deployments on Vertex AI. We worked with the Cloud team to add more optimized configurations to deploy Gemma 3 using @vllm_project and @sgl_project! 🎉 https://t.co/abKPAf2wlY
Nvidia has entered the cloud computing market with a new service that lets AI developers rent server chips directly, positioning the company as a competitor to established cloud providers. Alongside this, Nvidia announced several hardware and software advancements, including TensorRT acceleration that improves Stable Diffusion 3.5 performance on GeForce RTX and RTX PRO GPUs. The H200 NVL GPU, featuring 141GB of HBM3e memory and a PCIe interface, is designed for efficient, dense server deployments with low inference latency. Nvidia also introduced the RTX PRO 6000 Blackwell Max-Q Workstation Edition, equipped with the latest RT, Tensor, and CUDA cores and 96GB of GDDR7 memory, targeting data, simulation, and AI workloads. Complementing these hardware developments, Nvidia launched the DGX Cloud Lepton platform in early access: a unified AI platform that connects developers to thousands of GPUs globally for building, training, and deploying AI applications.

Meanwhile, Google Cloud has integrated Nvidia technology into its offerings, including G4 VMs powered by NVIDIA RTX PRO 6000 GPUs, which it says deliver four times the performance for AI, graphics, and simulation workloads. Google Cloud's Cloud Run now supports NVIDIA L4 GPUs for serverless AI inference with pay-per-second billing and rapid cold starts. Google DeepMind has also optimized Gemma 3 deployments on Vertex AI, improving AI model deployment efficiency.