A recent report by Epoch AI suggests that scaling AI training runs to roughly 10,000 times the compute of today's largest models by 2030 is technically feasible but not guaranteed. The report identifies four major constraints on this scaling: electric power, chip manufacturing capacity, data, and latency. Separately, OpenAI has ambitious plans to surpass Google's infrastructure through multi-datacenter training, relying on gigawatt clusters, telecom networking, long-haul fiber, and hierarchical and asynchronous SGD. Epoch concludes that 2e29 FLOP training runs will likely be feasible by 2030, though the cost of such runs would be enormous.
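For a sense of scale, here is a quick back-of-envelope comparison. The 2e25 FLOP figure for today's largest training runs is an assumption (a commonly cited order-of-magnitude estimate for GPT-4-class models), not a number taken from the report.

```python
# Back-of-envelope scale comparison; both inputs are illustrative assumptions.
current_run_flop = 2e25   # assumed order of magnitude of today's largest training runs
target_run_flop = 2e29    # the 2030 training-run size discussed by Epoch AI

print(f"Scale-up factor: {target_run_flop / current_run_flop:,.0f}x")  # ~10,000x
```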
How rapidly, and how far, can we scale AI models? https://t.co/IYBw2J7v5O
Multi-Datacenter Training: OpenAI's Ambitious Plan To Beat Google's Infrastructure. Gigawatt Clusters, Telecom Networking, Long Haul Fiber, Hierarchical & Asynchronous SGD, Distributed Infrastructure Winners, Silent Data Corruption, Stragglers https://t.co/unluurzsnd
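The headline ingredients above include hierarchical and asynchronous SGD for training across sites. Below is a minimal toy sketch of the hierarchical part of that idea (frequent synchronization within a datacenter, infrequent parameter averaging between datacenters), written in NumPy on a linear least-squares problem. The problem setup, step counts, and sync interval are all illustrative assumptions, not the scheme described in the article; asynchronous variants additionally let the cross-site averaging proceed with stale parameters rather than blocking.

```python
import numpy as np

# Toy hierarchical local SGD on a linear least-squares problem (illustrative only).
# Workers inside a datacenter synchronize gradients every step; datacenters only
# exchange (average) parameters every `outer_every` steps, the kind of infrequent,
# latency-tolerant communication that long-haul links between sites allow.
rng = np.random.default_rng(0)
dim, n_dcs, workers_per_dc = 16, 2, 4
lr, steps, outer_every = 0.05, 200, 20

w_true = rng.normal(size=dim)

def make_shard():
    # Each worker holds a private data shard drawn from the same linear model.
    X = rng.normal(size=(64, dim))
    y = X @ w_true + 0.1 * rng.normal(size=64)
    return X, y

shards = [[make_shard() for _ in range(workers_per_dc)] for _ in range(n_dcs)]
params = [np.zeros(dim) for _ in range(n_dcs)]  # one parameter replica per datacenter

for step in range(1, steps + 1):
    for dc in range(n_dcs):
        # Intra-datacenter data parallelism: average worker gradients every step.
        grads = [2.0 * X.T @ (X @ params[dc] - y) / len(X) for X, y in shards[dc]]
        params[dc] = params[dc] - lr * np.mean(grads, axis=0)
    if step % outer_every == 0:
        # Inter-datacenter sync: infrequent parameter averaging over the long-haul link.
        avg = np.mean(params, axis=0)
        params = [avg.copy() for _ in range(n_dcs)]

print("distance to optimum:", np.linalg.norm(np.mean(params, axis=0) - w_true))
```

The point of the two-level structure is that the expensive cross-site communication happens only once every `outer_every` steps, so long-haul latency is amortized over many local updates.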
Can AI Scaling Continue Through 2030? Epoch AI: We investigate the scalability of AI training runs. We identify electric power, chip manufacturing, data, and latency as constraints. We conclude that 2e29 FLOP training runs will likely be feasible by 2030. https://t.co/yR4dYaJZHp
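To make the 2e29 FLOP figure concrete, here is a rough arithmetic sketch of the hardware it could imply. Every input below (H100-class throughput, utilization, run length, per-GPU power draw) is an assumption chosen for illustration; this is not the report's methodology or its estimates.

```python
# Illustrative back-of-envelope: what a 2e29 FLOP run could require in H100-class
# hardware. All inputs are assumptions for the sake of arithmetic, not figures
# taken from the Epoch AI report.
flop_target = 2e29                   # training-run size discussed for 2030
peak_flops_per_gpu = 1e15            # ~H100-class dense BF16 throughput (approximate)
utilization = 0.4                    # assumed model FLOP utilization
run_seconds = 0.5 * 365 * 24 * 3600  # assume a ~6-month training run
power_per_gpu_kw = 1.4               # assumed all-in draw per GPU incl. cooling/overhead

gpu_seconds = flop_target / (peak_flops_per_gpu * utilization)
gpus_needed = gpu_seconds / run_seconds
power_gw = gpus_needed * power_per_gpu_kw / 1e6

print(f"GPUs needed: {gpus_needed:.2e}")        # tens of millions of GPUs under these assumptions
print(f"Estimated power draw: {power_gw:.1f} GW")
```

Under these particular assumptions the answer lands in the tens of millions of GPUs and tens of gigawatts, which is why power and chip manufacturing show up as two of the four constraints.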