Small errors reveal structural problems in artificial intelligence

David Riedman argues that trivial prompt failures in large language and image models expose deeper flaws in attention-based systems, with implications that extend from misleading student work to fatal self-driving car crashes.

David Riedman uses a series of everyday failures involving large language models and automated tools to argue that small errors point to foundational weaknesses in contemporary artificial intelligence systems. He describes a graduate student who unknowingly built a thesis proposal around a social theory that does not exist because a model hallucinated it, artificial intelligence security cameras that sent yet another false alert about a gun at a school, and automated multi-paragraph email suggestions in his university account that filled replies with random, unrelated information. These examples, he warns, create real risks: fake citations can end academic careers, police rushing to schools on false alarms puts lives in danger, and fabricated emails that reference nonexistent assignments or meetings damage trust between faculty and students.

Riedman then details an image-editing experiment with Gemini Nano Banana Pro in which repeated prompts to add a friend into a group photo produced increasingly absurd and incorrect outputs. His first prompt, “add my friend to this photo,” was followed literally, with no contextual understanding of the scene. A second prompt, “add my friend to the line of people and don’t change any faces or features,” captured the group-photo concept but rendered his friend 7 feet tall and altered everyone’s faces. A third attempt, “make everyone the same height and don’t change the faces,” generated two 10-foot versions of his friend, deleted one person while leaving a stray extra hand, and changed the faces of three people on the left. He attributes these failures to four critical issues: a “salience latch,” where the reasoning chain anchors on the wrong subject; “same height” being misread as a scaling problem rather than a placement adjustment; the model regenerating the full image instead of composing elements; and a “visual budget” tradeoff in which style, pose, and context are preserved at the expense of accurately rendering faces, or even of keeping all the subjects.

Using these “goofy pictures” as an analogy, Riedman connects the same attention-driven logic breakdowns to fatal incidents involving Tesla’s self-driving system, which relies solely on camera-based computer vision, avoiding the cost of spatial sensors like LiDAR that provide actual proximity data. He argues that this design invites the same kind of scale and salience errors that made his friend appear 10 feet tall. He states that Tesla has killed at least 772 drivers, pedestrians, and cyclists, and describes how early latch, path dependence, and compounding error can cause the system to prioritize lane markers over obstacles, resulting in crashes into concrete barriers at 80 mph or misjudgments of tractor trailers in which the car either misestimates distance and size or treats the space under the trailer as a clear path. By contrast, he highlights that a Waymo self-driving taxi has never been involved in a serious crash or fatality, and he reports estimates that a Waymo carries about $60,000 in sensors while Tesla’s cameras cost less than $1,000, framing Tesla’s approach as a deliberate cost-cutting choice that increases risk.
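The early-latch and path-dependence pattern Riedman describes can be sketched as a toy greedy search: once the first, locally most salient choice is made, every later step is conditioned on it, so the error compounds rather than corrects. The states, choices, and probabilities below are invented purely for illustration and are not drawn from any real driving system.

```python
# Invented two-step decision process: transitions[state][choice] = score.
# Greedy decoding latches onto the locally best first choice, while
# exhaustive search finds the globally better sequence.
transitions = {
    "start":        {"lane_markers": 0.6, "obstacle": 0.4},
    "lane_markers": {"continue": 0.55, "brake": 0.45},
    "obstacle":     {"brake": 0.9, "continue": 0.1},
}

def greedy_path(transitions, state="start"):
    """Commit to the highest-scoring choice at each step (early latch)."""
    path, prob = [], 1.0
    while state in transitions:
        state, p = max(transitions[state].items(), key=lambda kv: kv[1])
        path.append(state)
        prob *= p
    return path, prob

def best_path(transitions, start="start"):
    """Exhaustively score every two-step sequence instead."""
    best, best_p = None, 0.0
    for first, p1 in transitions[start].items():
        for second, p2 in transitions.get(first, {}).items():
            if p1 * p2 > best_p:
                best, best_p = [first, second], p1 * p2
    return best, best_p

# Greedy latches onto "lane_markers" (0.6) and ends with a lower joint
# score than the globally best sequence, which starts from "obstacle".
print("greedy:", greedy_path(transitions))
print("best:  ", best_path(transitions))
```

The point of the sketch is that greedy commitment never revisits its first choice: the 0.6-scored branch wins step one, even though the 0.4-scored branch leads to the better overall outcome.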

Riedman links these outcomes to the transformer architecture introduced in the 2017 Google paper “Attention Is All You Need,” explaining that self-attention lets models assign weighted focus to inputs in a way that is probabilistic and context-dependent rather than logically guaranteed. Once a model locks onto an initially salient token or object, downstream reasoning is conditioned on that choice, making errors self-reinforcing. He contrasts this with human problem-solving, where people use intuition, double-check information, and ask clarifying questions instead of trying to eliminate uncertainty at the outset. He is skeptical that better prompting or more data will resolve these issues, arguing that additional prompts can intensify feedback loops when objectives such as identity preservation, spatial realism, and height constraints conflict, and that models are optimizing to minimize loss under compute constraints, not to minimize real-world error. In his view, artificial intelligence systems fail not primarily because of hallucinations or insufficient data, but because they decide the wrong thing matters and then do not recover from a doomed chain of decisions, a pattern that becomes especially dangerous when models control physical systems like cars or act autonomously in high-stakes environments.
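The self-attention operation at the heart of that architecture can be sketched in a few lines of NumPy. This is a single-head, batch-free simplification of the scaled dot-product attention described in “Attention Is All You Need,” not any production model; the toy token embeddings are random values chosen only to show the mechanism.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Each query assigns a probability-weighted focus over all keys,
    then returns the weighted sum of the corresponding values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # each row sums to 1: a probabilistic focus
    return weights @ V, weights

# Three toy 4-dimensional token embeddings (random, for illustration only)
rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4))
output, weights = scaled_dot_product_attention(tokens, tokens, tokens)

# The focus is probabilistic, not logically guaranteed: every row of
# `weights` spreads attention across all tokens, and whichever token
# happens to score highest dominates that row's weighted sum.
print(weights.round(2))
```

Because the attention weights are a softmax over similarity scores, a token that is initially most salient draws the largest share of focus, and everything computed downstream is conditioned on that allocation, which is the mechanism behind the self-reinforcing errors Riedman describes.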

Impact Score: 55

OpenClaw pushes autonomous Artificial Intelligence agents into enterprises

OpenClaw’s rapid growth is accelerating interest in persistent, self-hosted autonomous agents that run continuously instead of waiting for prompts. NVIDIA is positioning NemoClaw as a more secure reference implementation for organizations that want local control, auditability and hardened deployment defaults.

Indiana launches Artificial Intelligence business portal

Indiana is rolling out IN AI, a statewide portal meant to help employers adopt Artificial Intelligence with practical guidance, workshops and peer support. State leaders and business groups are positioning the effort as a way to raise productivity, wages and job growth while keeping workers at the center.

Goodfire launches model debugging tool for large language models

Goodfire has introduced Silico, a mechanistic interpretability platform designed to let developers inspect and adjust model behavior during development. The company is positioning it as a way to give smaller teams deeper control over open-source models and more trustworthy outputs.

Nvidia launches Nemotron 3 Nano Omni for enterprise agents

Nvidia has introduced Nemotron 3 Nano Omni, a multimodal open model designed to support enterprise agents that reason across vision, speech and language. The launch extends Nvidia’s push beyond hardware into models and services while targeting more efficient agentic workflows.

Intel 18A-P node improves performance and efficiency

Intel plans to present new results for its 18A-P process at the VLSI 2026 Symposium, highlighting gains in performance, power efficiency, and manufacturing predictability. The updated node is positioned as a stronger option for customers seeking 18A density with better operating characteristics.
