Jun 8, 11:49 AM

Apple Study Reveals AI Reasoning Limits; Amazon, IBM, Salesforce Report Major Agentic AI Productivity Gains

Recent research from Apple has raised questions about how well large language models (LLMs) and large reasoning models (LRMs) handle complex tasks. The paper, titled 'The Illusion of Thinking,' finds that while standard LLMs perform well on low-complexity tasks, both LLMs and advanced reasoning models such as Claude, DeepSeek-R1, and o3-mini struggle or 'collapse' as task complexity increases. The study highlights that these models, despite sophisticated self-reflection mechanisms and increased token capacity, rely primarily on pattern matching rather than true reasoning: as complexity rises, their accuracy and ability to generalize diminish. Some researchers add that standard evaluation benchmarks may not fully capture reasoning quality.

In parallel, the adoption of agentic AI (systems of autonomous AI agents) is accelerating across the enterprise and technology sectors. Reports show that AI agents now handle significant portions of code generation, business process automation, and branding tasks. Tools like Lovart AI have enabled entrepreneurs to build brands such as Vegan Vogue, a luxury vegan handbag line, in under 10 minutes from a single prompt, complete with a PETA-approved vegan seal.

Tech companies such as Amazon, Salesforce, and Cognizant report substantial productivity improvements and cost savings from deploying AI agents across coding, testing, and customer support functions. Amazon's use of its Q Developer Agent has reduced application upgrade times from weeks to hours and saved $260 million annually. IBM reports $3.5 billion in annual productivity run-rate savings, while Salesforce's CodeGenie model has processed over 7 million lines of code and saved 30,000 hours per month. Industry analysts note that the rise of agentic AI is transforming traditional software development, with low-code and no-code platforms enabling faster application development and more autonomous business process management.