Microsoft has deployed the first full 8-rack NVIDIA GB200 NVL72 system on Azure, where it is now running production workloads for OpenAI, marking a significant step up in the lab's computational capacity. According to NVIDIA's published figures, the GB200 NVL72 delivers up to 30 times faster large language model (LLM) inference than the H100 Tensor Core GPU, 4 times faster LLM training, 25 times better energy efficiency, and 18 times faster data processing.
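To put the deployment's scale in perspective, here is a minimal back-of-envelope sketch in Python. It assumes the standard GB200 NVL72 configuration of 72 Blackwell GPUs and 36 Grace CPUs per rack; the speedup multipliers are NVIDIA's system-level claims versus the H100, not independent measurements.

```python
# Rough scale estimate for the 8-rack deployment described above.
# Assumptions (not stated in the announcement itself): the standard
# GB200 NVL72 rack configuration, and NVIDIA's published system-level
# performance multipliers relative to the H100.

RACKS = 8
GPUS_PER_RACK = 72   # NVL72: 72 Blackwell GPUs in a single NVLink domain
CPUS_PER_RACK = 36   # 36 Grace CPUs per rack (one per GB200 superchip)

# NVIDIA's claimed multipliers vs. H100 (marketing figures)
SPEEDUP_VS_H100 = {
    "LLM inference": 30,
    "LLM training": 4,
    "energy efficiency": 25,
    "data processing": 18,
}

total_gpus = RACKS * GPUS_PER_RACK
total_cpus = RACKS * CPUS_PER_RACK
print(f"{RACKS} racks -> {total_gpus} Blackwell GPUs, {total_cpus} Grace CPUs")

for workload, factor in SPEEDUP_VS_H100.items():
    print(f"{workload}: {factor}x vs. H100 (NVIDIA's claimed figure)")
```

Under these assumptions, an 8-rack system amounts to 576 Blackwell GPUs in production, which is what makes the per-workload multipliers consequential at fleet scale.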
Microsoft is running a full 8-rack GB200 NVL72 in Azure for OpenAI. The GB200 NVL72 is an absolute beast compared to the H100: LLM inference 30× faster vs. the NVIDIA H100 Tensor Core GPU, LLM training 4× faster, energy efficiency 25× better, data processing 18× faster… https://t.co/FEac3ay5aR
Nice, the 8-rack GB200 NVL72 is running production workloads for OpenAI. $NVDA $MSFT https://t.co/lIUSRzyjJb
Sam Altman just said: “first full 8-rack GB200 NVL72 now running in azure for openai” https://t.co/cW4CwHzpvl https://t.co/XN48agSEOv