Experts in artificial intelligence are voicing concerns about the current trajectory of auto-regressive large language models (LLMs) and whether they can reach human-level AI. Yann LeCun noted that sources of reliable data are becoming scarce and that the costs of manual post-training are rising. In his view, these models in their current form are not sufficient to reach human-level AI, but they remain useful. Andrew Ng, another prominent figure in machine learning, echoed these sentiments, asserting that artificial general intelligence (AGI) is likely "many decades away, maybe even longer." He criticized companies claiming AGI will arrive in a year or two, suggesting they rely on non-standard definitions that lower the bar. Meanwhile, synthetic data is gaining traction as a complement to traditional data sources, with some experts arguing it could support the training of future models such as Llama 4/5 and GPT-5/6. Although numerous datasets are available, only a small fraction are synthetic, suggesting that more innovative data solutions will be needed to sustain scaling laws in AI development.
OK, here is my best guess on the state of LLMs:
- Chasing benchmark scores steers us further away from achieving AGI
- gpt-5 and llama-4 can still be much better without being trained on 10x or 100x more data
- We're running out of human tokens, but synthetic data is here to the… https://t.co/kJTDPP0UqB https://t.co/xSRNtOx5PM
Machine learning pioneer Andrew Ng says AGI is still "many decades away, maybe even longer," and that companies claiming it is only a year or two away are using non-standard definitions to lower the bar https://t.co/zTERJM0tTr
.@ylecun about AGI: "Auto-Regressive LLMs in their current form will not take us to human-level AI. That doesn't mean they are not useful." https://t.co/DfalKIf3ZL