The Enigma Project team, led by Konstantin Wille, has set new records in NanoGPT training speed on 8× NVIDIA H100 GPUs. The latest record reached a FineWeb validation loss of 3.28 in 2.979 minutes, beating the previous record of 2.990 minutes by roughly 0.7 seconds. The gain came from overlapping gradient communication with backward computation (sketched below). Earlier, the team had cut the time from 3.014 to 2.990 minutes by accelerating the gradient all-reduce. Separately, researchers including Michal Takac have reported substantial algorithmic gains on a single RTX 4090, on the order of 70-80%, by carrying PhysicsML and SciML ideas over into large language models without changing hyperparameters or CUDA kernel configurations. Together, these results show continued progress in transformer training efficiency, with a focus on scaling NanoGPT training performance.
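To make the changelog item concrete, here is a minimal sketch of overlapping gradient all-reduce with backward computation in PyTorch. It assumes a `torch.distributed` process group is already initialized (e.g. via `torchrun`); the function name `attach_overlapped_allreduce` and the hook structure are illustrative assumptions, not the record's actual implementation.

```python
import torch
import torch.distributed as dist

def attach_overlapped_allreduce(model: torch.nn.Module):
    """Launch an async all-reduce for each parameter's gradient as soon as
    backward produces it, instead of one blocking all-reduce at the end."""
    handles = []

    def make_hook(param):
        def hook(*_):
            # Pre-divide so the SUM all-reduce yields the cross-rank mean.
            param.grad /= dist.get_world_size()
            # async_op=True returns a handle immediately; backward keeps
            # computing gradients for earlier layers while this communication
            # is in flight, hiding comm time behind compute.
            handles.append(dist.all_reduce(param.grad, async_op=True))
        return hook

    for p in model.parameters():
        if p.requires_grad:
            # Fires once per step, after p.grad is fully accumulated.
            p.register_post_accumulate_grad_hook(make_hook(p))

    def wait_all():
        # Call after loss.backward() and before optimizer.step().
        for h in handles:
            h.wait()
        handles.clear()

    return wait_all
```

Typical use: call `wait = attach_overlapped_allreduce(model)` once at setup, then per step run `loss.backward()` (all-reduces launch layer by layer during backward), call `wait()`, and only then `optimizer.step()`. Production implementations additionally bucket small gradients into larger all-reduce calls to amortize launch overhead.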
New NanoGPT training speed record: 3.28 FineWeb val loss in 2.979 minutes on 8xH100
Previous record: 2.990 minutes (0.7s slower)
Changelog: Overlapped gradient communication with computation
New record-holder: @ryanyang0
https://t.co/nVBhZ807JF
Another win in the NanoGPT2 speedrun on an RTX4090, combining some of the techniques we used at @TheDimensionLab in @siml_ai model architectures (for PhysicsML) with newer ideas for optimising Transformer bottlenecks. I think this run was already below 3.3821 val_loss earlier than the currently reported point. https://t.co/SShERfLSPs
🚀 Congrats to @KonstantinWille and the Enigma Project team https://t.co/8HIkmjKMV4 for being NanoGPT speed-run world-record champions for a glorious day 🏆🔥 https://t.co/IcxIO5OGNn