Simon Willison’s latest posts tagged with artificial intelligence span hands-on experiments, product notes, and security analysis across the current wave of large language model tooling. A recurring theme is practical agentic coding: he highlights how Anthropic’s Claude Code answers questions about itself by fetching its own documentation via a dedicated index file, and he demonstrates how instructing it to use sub-agents can parallelize complex tasks against a codebase. He also documents where Claude Code logs are stored locally and how to extend their retention, and shares a workflow video where he vibe-codes a small utility using Claude Code for web.
Several entries focus on Anthropic’s expanding stack. Willison previews and evaluates Claude Code for web as an asynchronous coding agent, and he explores the newly introduced Claude Skills pattern for packaging model abilities. He covers the release of Claude Haiku 4.5, noting its support for reasoning, a 200,000 token context window, higher maximum output, and a February 2025 knowledge cutoff. He cites the system card’s description of training interventions that make the model more context-aware, helping it avoid stopping prematurely and reducing agentic laziness. Alongside tooling, he spotlights Anthropic’s interpretability research that finds cross-modal features for visual concepts shared between ASCII art, SVG, and text, and shows how identified features can be negatively or positively steered to change generated imagery.
Willison gives detailed attention to browser agents and security. He tests OpenAI’s ChatGPT Atlas, a Mac-only browser that adds chat context from pages, a user-controlled browser memories feature, and an agent mode with clearly stated boundaries and ARIA guidance for site authors. He remains skeptical about the safety of browser automation and calls for deep explanations of defenses against prompt injection, later linking to a follow-up by OpenAI’s chief information security officer addressing Atlas risks. Complementing that, he summarizes new research from Brave demonstrating “unseeable” prompt injections in screenshots and text-triggered exfiltration in competing agentic browsers, and he cites commentary arguing that prompt injection may be fundamentally hard to solve in current systems.
Beyond agents, he logs numerous project-driven notes. Highlights include porting the classic SLOCCount tool to WebAssembly using WebPerl and Emscripten, brute-forcing a setup to run DeepSeek-OCR on an NVIDIA Spark with Claude Code’s help, and trying OpenAI’s deep research model while building a viewer for Responses API traces. He also tracks a legal update relieving OpenAI from preserving all ChatGPT outputs in an ongoing case with specific exceptions. Across the posts, Willison combines critical security scrutiny with pragmatic engineering, emphasizing concrete workflows that make today’s models more reliably useful.
