Small errors reveal structural problems in artificial intelligence

David Riedman argues that trivial prompt failures in large language and image models expose deeper flaws in attention-based systems, with implications that extend from misleading student work to fatal self-driving car crashes.

David Riedman uses a series of everyday failures involving large language models and automated tools to argue that small errors point to foundational weaknesses in contemporary artificial intelligence systems. He describes a graduate student who unknowingly built a thesis proposal around a social theory that does not exist, invented by an artificial intelligence hallucination; artificial intelligence security cameras that sent yet another false alert about a gun at a school; and automated multi-paragraph email suggestions in his university account that filled replies with random, unrelated information. He warns that these examples create real risks: ending academic careers through fake citations, putting lives at risk when police rush to schools on false alarms, and damaging trust between faculty and students when fabricated emails reference nonexistent assignments or meetings.

Riedman then details an image-editing experiment using Gemini Nano Banana Pro in which repeated prompts to add a friend into a group photo produced increasingly absurd and incorrect outputs. His first prompt, “add my friend to this photo,” was followed literally but without any contextual understanding of the scene. A second prompt, “add my friend to the line of people and don’t change any faces or features,” produced a version that captured the group-photo concept but made his friend 7 feet tall and altered everyone’s faces. A third attempt, “make everyone the same height and don’t change the faces,” led the model to generate two 10-foot versions of his friend, delete one person while leaving an extra hand, and change the faces of three people on the left. He attributes these failures to four critical issues: “salience latch,” where the reasoning chain anchors on the wrong subject; “same height” being misread as a scaling problem instead of a placement adjustment; the model regenerating the full image instead of composing elements; and a “visual budget” tradeoff in which style, pose, and context are preserved at the expense of accurately rendering faces or even keeping all subjects.
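One way to see the “visual budget” tradeoff is as a weighted multi-objective optimization in which the edit constraints pull on a single shared regeneration step. The sketch below is a toy illustration under assumed numbers, not Riedman’s analysis and nothing like the model’s real internals: it couples a “fix the height” term to a “don’t change the faces” term, so raising the weight on one constraint necessarily worsens the other.

```python
# Toy illustration of the "visual budget" tradeoff (assumed numbers, not the
# model's actual objective): a "fix the height" term and a "don't change the
# faces" term are coupled through one shared regeneration step, so they
# cannot both be satisfied, only traded off.

def regenerate(weight_height, weight_faces, coupling=0.6):
    """Minimize weight_height*(h - 1)**2 + weight_faces*f**2, where the
    regeneration step ties face drift to the height correction, f = coupling*h.
    h = 1 means "height fully corrected"; f = 0 means "faces untouched"."""
    # Closed-form minimizer of weight_height*(h-1)**2 + weight_faces*(coupling*h)**2
    h = weight_height / (weight_height + weight_faces * coupling**2)
    return h, coupling * h

for w_h, w_f in [(1.0, 1.0), (5.0, 1.0), (1.0, 5.0)]:
    h, f = regenerate(w_h, w_f)
    print(f"height weight {w_h}, face weight {w_f}: "
          f"height correction {h:.2f}, face drift {f:.2f}")
```

Pushing harder on the height constraint improves the height correction but increases face drift, and pushing harder on face preservation does the reverse, which is the pattern Riedman describes when each new prompt fixes one thing and breaks another.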

Using these “goofy pictures” as an analogy, Riedman connects the same attention-driven logic breakdowns to fatal incidents involving Tesla’s self-driving system, which relies solely on camera-based computer vision and forgoes spatial sensors like LiDAR, which provide actual proximity data, in order to cut cost. He argues that this choice invites the same kind of scale and salience errors that made his friend appear 10 feet tall. He states that Tesla vehicles have killed at least 772 drivers, pedestrians, and cyclists, and describes how early latch, path dependence, and compounding error can cause the system to prioritize lane markers over obstacles, resulting in crashes into concrete barriers at 80 mph or in misjudgments of tractor trailers where the car either misestimates distance and size or treats the space under the trailer as a clear path. In contrast, he highlights that no Waymo self-driving taxi has been involved in a serious crash or fatality, and he reports estimates that a Waymo carries about $60,000 in sensors while Tesla’s cameras cost less than $1,000, framing Tesla’s approach as a deliberate cost-cutting choice that increases risk.
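The scale problem he is pointing at can be made concrete with simple pinhole-camera arithmetic. The sketch below is my own illustration with assumed numbers (an assumed focal length, a 4 m trailer, a 1.6 m roadside sign), not a claim about Tesla’s actual perception stack: a camera alone measures angular size, so real height and distance are confounded, whereas a range sensor such as LiDAR returns distance directly.

```python
# Toy pinhole-camera arithmetic (assumed numbers, not Tesla's perception
# stack): projected size = focal_length * real_height / distance, so a
# camera alone cannot separate "small and close" from "large and far".
FOCAL_PX = 1000.0  # assumed focal length, in pixels

def pixel_height(real_height_m, distance_m):
    """Projected height of an object under a simple pinhole model."""
    return FOCAL_PX * real_height_m / distance_m

def inferred_distance(pixel_h, assumed_height_m):
    """Distance the system infers if it assumes the object's real height."""
    return FOCAL_PX * assumed_height_m / pixel_h

# A 4 m trailer at 50 m and a 1.6 m roadside sign at 20 m both project
# to the same 80 pixels.
print(pixel_height(4.0, 50.0), pixel_height(1.6, 20.0))   # 80.0 80.0

# Latch onto the wrong class and assume a 1.6 m object, and the trailer at
# 50 m is read as something 20 m away; make the opposite assumption and a
# near obstacle is read as comfortably distant.
print(inferred_distance(80.0, 1.6))                        # 20.0

# A range sensor such as LiDAR measures distance directly, which is the
# proximity data Riedman says the camera-only approach gives up.
```

The same confounding of size and distance that produced a 10-foot friend in a photo becomes, in his framing, a safety tradeoff when the misread object is a trailer across the road.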

Riedman links these outcomes to the transformer architecture introduced in the 2017 Google paper “Attention Is All You Need,” explaining that self-attention lets models assign weighted focus to inputs in a way that is probabilistic and context-dependent rather than logically guaranteed. Once a model locks onto an initially salient token or object, downstream reasoning is conditioned on that choice, making errors self-reinforcing. He contrasts this with human problem-solving, where people use intuition, double-check information, and ask clarifying questions rather than committing to a single interpretation at the outset. He is skeptical that better prompting or more data will resolve these issues, arguing that additional prompts can intensify feedback loops when objectives such as identity preservation, spatial realism, and height constraints conflict, and that models optimize to minimize loss under compute constraints, not to minimize real-world error. In his view, artificial intelligence systems fail not primarily because of hallucinations or insufficient data, but because they decide the wrong thing matters and then cannot recover from a doomed chain of decisions, a pattern that becomes especially dangerous when models control physical systems like cars or act autonomously in high-stakes environments.
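For readers who want to see the mechanism he is describing, the sketch below implements single-head scaled dot-product attention from “Attention Is All You Need” in NumPy, using made-up toy vectors. The softmax turns raw similarity scores into a weighted mix, so whichever key happens to align with a query takes most of that row’s weight, and everything computed downstream for that position is conditioned on that allocation.

```python
# Minimal single-head scaled dot-product attention in NumPy, following the
# formulation in "Attention Is All You Need" (Vaswani et al., 2017). The
# input vectors and projection matrices below are toy values of my own.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V for one head."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    weights = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
d = 8
X = rng.normal(size=(4, d))   # four tokens, e.g. a short prompt
X[2] *= 4.0                   # one token with an outsized activation
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.3 for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
print(np.round(weights, 2))
# Each row is a probability distribution over the inputs. Nothing enforces a
# "correct" allocation: whichever key best aligns with a query dominates that
# row, the amplified token tends to grab weight, and the output at every
# position is then conditioned on that early, probabilistic choice.
```

Because every output row is a mixture determined by these weights, an early misallocation propagates into everything the model computes next, which is the self-reinforcing error Riedman describes.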
