Alibaba has launched QwQ-32B-Preview, an open-source AI model developed by the Qwen team and designed to enhance reasoning capabilities. This experimental model has 32 billion parameters and is reported to outperform existing models such as Claude and GPT-4 on mathematical reasoning benchmarks. QwQ is now accessible in VSCode via CodeGPT, runnable locally with Ollama, and available on HuggingChat. The release is seen as a significant advancement in open-source AI, with users noting its impressive performance on challenging reasoning tasks, though the model is acknowledged to have certain limitations while still in the preview stage.
GPT-3.5 (text-davinci-003) dropped two years ago. It was pretty astonishing (you can see me start to experiment in the thread). GPT-4 represented about as much of a leap a few months later. Since then, there have been improvements across a wide variety of areas, but nothing as shocking (yet). https://t.co/BgKaDQw0RU
And yet, none of these new models are significantly better than the original GPT-4 from March 2023. Writing code without LLMs versus writing code with GPT-4 was a paradigm shift. Writing code with Claude 3.5 Sonnet versus writing code with GPT-4 is marginally better sometimes. https://t.co/6kn91DTeIM
A year ago nobody outside OpenAI had trained a model as good as GPT-4. Today there are dozens — and if you trust the benchmarks, that includes some that you can run on a laptop (Qwen2.5-32B perhaps?). What changed? What techniques are used now that weren't known a year ago?