OpenAI's large language models (LLMs) continue to advance rapidly. GPT-6 is projected to be trained with roughly 90 times the compute of GPT-4. At the same time, training costs at the low end have fallen dramatically: GPT-2 class models can now be trained for less than $1,000. For instance, reproducing GPT-2 today costs around $672, using a single 8xH100 GPU node for 24 hours. These trends are expected to continue, with smaller models becoming more capable and LLM inference becoming ubiquitous. At the other end of the scale, OpenAI now generates about 1.2 quadrillion words per year. For context, Llama 3 was trained on 15 trillion tokens, roughly 120 times fewer words once tokens are converted to words.
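As a sanity check on that comparison, here is a minimal back-of-the-envelope sketch. The 0.65 words-per-token factor is an assumption (a common rule of thumb for English text), not a number from the source:

```python
# Back-of-the-envelope check of the scale comparison above.
# Assumption: ~0.65 words per token, a rough rule of thumb for English text.

openai_words_per_year = 1.2e15   # ~1.2 quadrillion words generated annually
llama3_tokens = 15e12            # Llama 3 training set: ~15 trillion tokens
words_per_token = 0.65           # assumed conversion factor (not from the source)

llama3_words = llama3_tokens * words_per_token
ratio = openai_words_per_year / llama3_words

print(f"Llama 3 training data: ~{llama3_words:.2e} words")
print(f"OpenAI annual output is ~{ratio:.0f}x larger")  # ~123x, consistent with "about 120x"
```

Note that the raw token ratio (1.2 quadrillion / 15 trillion) is 80x; the "about 120x" figure only follows after discounting tokens to words.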
OpenAI is generating about 1.2 quadrillion words every year. (🤯) (For context, Llama 3 was trained on 15 trillion tokens, about 120x fewer words.) https://t.co/t15EyQq2k2
Today GPT-2 class models can be trained for < $1K 🤯 We are going to see the following trends in the next 5 years:
- Smaller models will become performant
- LLMs become cheaper and cheaper to train
- LLM inference becomes ubiquitous
We should expect to see several Sonnet 3.5 class…
In 2019, OpenAI announced GPT-2 with this post: https://t.co/tZzc5DoXEd Today (~5 years later) you can train your own for ~$672, running on one 8xH100 GPU node for 24 hours. Our latest llm.c post gives the walkthrough in some detail: https://t.co/dUnbh1LshJ Incredibly, the… https://t.co/ZI9qfpuLse
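The $672 figure implies a straightforward hardware cost breakdown. A minimal sketch of that arithmetic, assuming the quoted price covers node rental alone:

```python
# Implied pricing behind the ~$672 GPT-2 training run described above.
# Assumption: the quoted cost is pure node rental (no storage or egress extras).

total_cost_usd = 672.0
hours = 24
gpus_per_node = 8   # one 8xH100 node

node_hour_rate = total_cost_usd / hours          # ~$28.00 per node-hour
gpu_hour_rate = node_hour_rate / gpus_per_node   # ~$3.50 per H100-hour

print(f"Implied node rate: ${node_hour_rate:.2f}/hr")
print(f"Implied GPU rate:  ${gpu_hour_rate:.2f}/H100-hr")
```

The implied ~$3.50 per H100-hour is in line with typical cloud GPU spot pricing, which is what makes the sub-$1K reproduction plausible.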