Apple research exposes reasoning model collapse on complex problems

A new Apple study finds that today’s reasoning-focused Artificial Intelligence models fail catastrophically when faced with sufficiently complex logic puzzles.

Apple Machine Learning Research’s latest preprint, ‘The Illusion of Thinking,’ scrutinizes how so-called reasoning models, dubbed Large Reasoning Models (LRMs), handle logical problem solving. The team designed a controlled puzzle environment to sidestep industry-standard benchmarks, which can be misleading. On simpler puzzles, plain language models outperformed their reasoning-enhanced versions. As puzzle complexity increased, the reasoning models surpassed their standard counterparts, but only briefly.

The study reveals a stark limitation: once tasks become truly challenging, both standard and reasoning models suffer a dramatic plunge in accuracy and effort. The models not only fail to produce correct answers but also generate less output, effectively abandoning their attempts at the more complex puzzles. Even explicit guidance, in the form of the precise algorithm needed for a solution, did not overcome this barrier at high complexity levels.
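To make the idea of a controlled puzzle environment concrete, here is a minimal sketch of a puzzle whose difficulty is set by a single parameter and whose answers can be checked step by step. This summary does not name the puzzles used, so Tower of Hanoi is chosen here purely as an illustrative assumption. The function names `solve_hanoi` and `is_valid_solution` are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

Move = Tuple[int, int]  # (source peg, destination peg), pegs numbered 0..2


def solve_hanoi(n: int, src: int = 0, dst: int = 2, aux: int = 1) -> List[Move]:
    """Return the optimal move list for n disks using the standard recursion."""
    if n == 0:
        return []
    return (
        solve_hanoi(n - 1, src, aux, dst)    # clear the n-1 smaller disks out of the way
        + [(src, dst)]                       # move the largest disk to its target
        + solve_hanoi(n - 1, aux, dst, src)  # put the smaller disks back on top
    )


def is_valid_solution(n: int, moves: List[Move]) -> bool:
    """Simulate the moves; every step must be legal and the goal state reached."""
    pegs = [list(range(n, 0, -1)), [], []]  # largest disk at the bottom of peg 0
    for src, dst in moves:
        if not pegs[src]:
            return False  # no disk to move from the source peg
        disk = pegs[src][-1]
        if pegs[dst] and pegs[dst][-1] < disk:
            return False  # a larger disk may not sit on a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs[2] == list(range(n, 0, -1))


if __name__ == "__main__":
    # Complexity is controlled by n; the shortest solution has 2**n - 1 moves.
    for n in range(1, 11):
        moves = solve_hanoi(n)
        assert is_valid_solution(n, moves)
        print(n, len(moves))
```

In a setup like this, a checker such as `is_valid_solution` can grade a model's proposed move list exactly, and raising `n` increases the required solution length exponentially without changing the rules of the puzzle.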

This work builds on previous warnings from the same research group, underscoring that current language models do not perform genuine reasoning but instead mimic patterns learned from their training data. The Apple team’s critique is echoed by other researchers, notably Subbarao Kambhampati, who argues against equating intermediate token generation with actual thinking. Their shared view is that marketing and benchmarks that mask these weaknesses do not change the underlying reality: neural networks, no matter how sophisticated, remain bounded by their training and lack authentic reasoning capability once confronted with new, truly arduous challenges.


