Progress in Algorithmic Performance of LLMs Has Been Phenomenal! The compute required to reach a set performance threshold has halved approximately every 8 months over the last few years, a rate substantially faster than hardware gains per Moore's Law. It will get cheaper over…
Great paper with concrete data points for where, and by how much, algorithmic advancements vs compute/scale have contributed to the progress in the transformer architecture. Even modest algorithmic efficiency gains could have a significant impact on the state of the AI… https://t.co/ChPxCLJlhs
Here's a good estimate of how fast the capabilities of LLMs have been growing: several times as fast as Moore's Law! The compute needed to achieve the same outcome has been halving every 5 to 14 months, with no sign of slowing. Most gains are from bigger scale. https://t.co/V1X5BqMf2U https://t.co/6jqp8yNS93

Recent studies indicate that algorithmic progress in language models has been advancing rapidly: the compute needed to achieve a set performance level has been halving every 5 to 14 months on average, several times faster than Moore's Law, with transformers leading efficiency gains and no sign of slowing down. Both algorithmic advancements and compute scaling have contributed significantly to progress in the transformer architecture, and even modest algorithmic efficiency gains could have a substantial impact on AI development.
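To get a feel for how these rates compound, here is a minimal back-of-the-envelope sketch. It assumes an 8-month algorithmic halving time (the approximate figure quoted above, within the 5-to-14-month range) and a 24-month Moore's Law doubling time; both numbers are illustrative assumptions for the calculation, not precise results from the cited paper.

```python
# Back-of-the-envelope comparison of algorithmic vs hardware efficiency gains.
# Assumptions (illustrative, not taken directly from the cited paper):
#   - compute needed for a fixed performance level halves every 8 months (algorithmic progress)
#   - hardware efficiency doubles every 24 months (a common statement of Moore's Law)

def gain_factor(years: float, doubling_months: float) -> float:
    """Multiplicative efficiency gain after `years`, given a doubling (or halving) period in months."""
    return 2 ** (years * 12 / doubling_months)

ALGO_HALVING_MONTHS = 8.0      # assumed algorithmic halving time
MOORE_DOUBLING_MONTHS = 24.0   # assumed Moore's Law doubling time

for years in (1, 2, 4):
    algo = gain_factor(years, ALGO_HALVING_MONTHS)
    moore = gain_factor(years, MOORE_DOUBLING_MONTHS)
    print(f"{years} yr: algorithmic gain ~{algo:.1f}x, "
          f"Moore's Law gain ~{moore:.1f}x, combined ~{algo * moore:.1f}x")
```

Under these assumed rates, algorithmic progress alone yields roughly an 8x reduction in required compute over two years versus about 2x from hardware, which is the sense in which even modest efficiency gains compound into large effects.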
