Prime Intellect has introduced OpenDiLoCo, an open-source implementation and scaling of DeepMind's Distributed Low-Communication (DiLoCo) training method. The framework enables globally distributed AI model training and has been demonstrated across three countries with 90-95% compute utilization. OpenDiLoCo lets nodes synchronize only every 500 steps, sharply reducing communication requirements and removing the need for co-located hardware. The team trained a 1.1-billion-parameter model, three times the size of the original DeepMind work, using a hybrid codebase built on torch FSDP and hivemind, over links with less than 100 Mb/s of bandwidth. This marks a significant step toward making decentralized AI training more accessible and efficient.
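The communication pattern described above, local training with a synchronization only every 500 steps, can be sketched as follows. This is a hedged toy simulation, not Prime Intellect's actual code: the helper names (`inner_step`, `diloco_round`) and hyperparameters (`H`, `outer_lr`, `momentum`) are illustrative assumptions, and a toy quadratic loss stands in for real model training.

```python
import numpy as np

def inner_step(params, target, lr=0.05):
    # One local optimizer step on a toy quadratic loss 0.5 * ||params - target||^2.
    grad = params - target
    return params - lr * grad

def diloco_round(global_params, worker_targets, velocity, H=500,
                 outer_lr=0.7, momentum=0.9):
    """One outer round of a DiLoCo-style loop: every worker starts from the
    same global parameters, trains locally for H steps, and only the resulting
    parameter deltas ("pseudo-gradients") are communicated and averaged."""
    deltas = []
    for target in worker_targets:
        local = global_params.copy()
        for _ in range(H):          # H local steps, zero communication
            local = inner_step(local, target)
        deltas.append(global_params - local)   # pseudo-gradient
    avg_delta = np.mean(deltas, axis=0)        # the only all-reduce per round
    # Outer update; DiLoCo uses SGD with Nesterov momentum, plain momentum here.
    velocity = momentum * velocity + avg_delta
    return global_params - outer_lr * velocity, velocity

# Two simulated workers whose local optima differ; the global model should
# settle near the mean of the worker targets (2.0) despite syncing rarely.
params = np.zeros(4)
velocity = np.zeros(4)
worker_targets = [np.full(4, 1.0), np.full(4, 3.0)]
for _ in range(100):
    params, velocity = diloco_round(params, worker_targets, velocity, H=500)
```

The point of the sketch is the bandwidth math: with 500 local steps per round, parameters cross the network once where ordinary data-parallel training would have synchronized gradients 500 times.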
This is so awesome 👏🙏 A DeepMind engineer congratulating @PrimeIntellect on reproducing their method for decentralized training. Mega respect to the DeepMind team. Absolute AI chads. https://t.co/atn49ySqUk
Talking Decentralized AI now. https://t.co/xTC0ocDOfo
I really wish we could have open-sourced DiLoCo. But it's even better to see it reproduced by others. It's crazy that distributed training works so well despite communicating orders of magnitude less. Accelerate! 🚀