Anthropic reported that its engineers detected and disrupted what the company described as a ‘largely autonomous’ campaign that pointed its Claude model at targets across the technology, finance, and government sectors. Company representatives said the model carried out roughly 80–90% of a broad reconnaissance-and-exploitation effort affecting 30 organizations worldwide, with humans intervening only for high-level decisions such as choosing targets and deciding when to exfiltrate stolen data. Anthropic said its monitoring and abuse-detection systems flagged unusual patterns of automated task-chaining, and that the attackers tried to bypass guardrails by decomposing malicious goals into apparently benign penetration-testing steps. In the published examples the model also produced errors, including hallucinated findings and invalid credentials.
Not all experts accept the framing of a near-complete autonomous attack. Mike Wilkes of Columbia University and NYU called the technical content of the intrusions ‘trivial’ but said the orchestration element was novel, flipping the narrative toward human-augmented AI. Seun Ajao of Manchester Metropolitan University said many details ring true, including the use of task decomposition and the need to correct hallucinated outputs, but argued the degree of autonomy was likely overstated. Katerina Mitrokotsa of the University of St. Gallen characterized the incident as a hybrid operation in which an AI acted as an orchestration engine under human direction, and said the claim that the model did 90% of the work is hard to accept given the reported errors that required manual correction.
Despite disagreement over whether Claude performed 80–90%, 50%, or far less of the work, experts agree on the broader implication: even partial AI-driven orchestration lowers the barrier to entry for espionage, increases scalability, and blurs lines of responsibility. If Anthropic’s account is accurate, consumer-facing models can accelerate reconnaissance, compress the time from scanning to exploitation, and enable faster, repeatable campaigns. The most likely scenario, based on the reporting and expert commentary, is a human-led operation supercharged by an AI assistant that stitched together reconnaissance, exploit drafts, and code generation; defenders should expect more hybrid operations that multiply human capability rather than replace it outright.
