Recent discussions on Twitter highlight a significant shift in the language-model landscape, particularly around cost efficiency. Anthropic's new Haiku, described as the 'cheap and fast model', is priced in GPT-3.5 territory while offering capabilities close to the original GPT-4. It is seen as a breakthrough in cost efficiency: 24-40 times cheaper than GPT-4 Turbo, offering dirt-cheap inference with near-GPT-4 performance. Commentators argue the economic implications are enormous, with OpenAI expected to respond with a model that outperforms GPT-4 at reasoning while being much cheaper than GPT-3.5 Turbo. The gap between small and huge models is also closing: Haiku sits only 80 points behind Opus, versus the 140-point gap between GPT-4 and GPT-3.5 Turbo. The same trend shows up in training costs: a GPT-3.5/Llama-2-level model can now be trained from scratch for about $10M in two months, where commentators estimate the same capability cost OpenAI (OAI) 10-20 times more just a year ago.
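As a quick sanity check on those numbers, here is a minimal sketch. The per-million-token prices are the publicly listed API rates at the time of these discussions and are assumptions of this sketch (they do not appear in the tweets themselves), and the win-probability formula assumes the 80- and 140-point gaps refer to Elo-style ratings as used on Chatbot Arena.

```python
# Sanity-checking the cost and "points" claims above.
# Assumed API prices (USD per 1M tokens), not stated in the tweets:
#   Claude 3 Haiku: $0.25 input / $1.25 output
#   GPT-4 Turbo:    $10 input  / $30 output
haiku = {"input": 0.25, "output": 1.25}
gpt4_turbo = {"input": 10.0, "output": 30.0}

for kind in ("input", "output"):
    ratio = gpt4_turbo[kind] / haiku[kind]
    print(f"{kind}: GPT-4 Turbo costs {ratio:.0f}x Haiku")
# -> input: 40x, output: 24x, i.e. the quoted "24-40 times cheaper" range.

# If the point gaps are Elo ratings (an assumption), the expected
# head-to-head win rate of the stronger model over the weaker one is:
def elo_win_probability(gap: float) -> float:
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))

print(f"Opus over Haiku:     {elo_win_probability(80):.0%}")   # ~61%
print(f"GPT-4 over GPT-3.5T: {elo_win_probability(140):.0%}")  # ~69%
```

Read this way, an 80-point gap means the much larger model wins only about 61% of head-to-head comparisons, which is what 'the gap is closing' means in concrete terms.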
Just $10M and two months to train a GPT-3.5/Llama-2-level model from scratch. For context, it probably cost OAI 10-20x more just a year ago! The more we improve as a field thanks to open source, the cheaper & more efficient it gets! All companies should now train their own… https://t.co/3xnoNwt03p
That's the exact opposite IMO! $10M to train a GPT-3.5-level model, whereas it probably cost OAI at least 10-20x more just a year or two ago. The more we improve as a field thanks to open source, the cheaper & more efficient it gets to produce the same capabilities. Let's go… https://t.co/bEtUXvVb4n
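For a sense of scale, a back-of-envelope sketch of what a $10M, two-month run implies. The ~$2/GPU-hour figure is a hypothetical blended A100 cloud rate assumed here, not a number from the tweets:

```python
# Rough sanity check of the "$10M and two months" training figure.
BUDGET_USD = 10_000_000
GPU_HOUR_COST = 2.0   # assumed blended $/A100-hour (hypothetical)
HOURS = 2 * 30 * 24   # two months of wall-clock time

gpu_hours = BUDGET_USD / GPU_HOUR_COST
cluster_size = gpu_hours / HOURS
print(f"{gpu_hours:,.0f} GPU-hours ~= a {cluster_size:,.0f}-GPU cluster for two months")
# -> 5,000,000 GPU-hours, roughly a 3,500-GPU cluster running for two months.
```

Under those assumptions the budget buys about 5M GPU-hours, which is at least the right order of magnitude for a GPT-3.5/Llama-2-class pretraining run.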
Huge models that barely beat GPT-3.5 aren't as interesting as 7B models that can beat GPT-3.5, or as Claude Haiku, which offers dirt-cheap inference with GPT-4-level performance. Fine-tuning is also best done on small models.