The UK’s AI Security Institute (AISI), formerly the AI Safety Institute, was established by the UK government in 2023 to investigate both the capabilities and risks of frontier artificial intelligence models. In new evaluations, the organisation reports that even the most advanced systems can be misused, challenging assumptions about vendor trust and model safety just as adoption of artificial intelligence accelerates. AISI has focused on how models perform on technical tasks such as biological research and software development, while also assessing how easily they can be manipulated into supporting illicit activity.
So far, AISI has published detailed performance tests on two widely used models: OpenAI’s o1 and Anthropic’s Claude 3.5 Sonnet. OpenAI’s first reasoning model, o1, performed comparably overall to GPT-4o, the earlier OpenAI model used as a reference point, and AISI observed similar inherent cybersecurity risks in both systems. However, o1 suffers from several reliability and tooling problems of its own, and it underperformed GPT-4o on general reasoning and coding, although the two models were near equals in technical areas such as biological research. Claude 3.5 Sonnet far outperformed other models in biological research, as well as in engineering and reasoning tasks, but its guardrails are less robust: AISI identified several ways to jailbreak the system and elicit dangerous responses.
Beyond these public case studies, AISI has evaluated 22 anonymised models, making 1.8 million total attempts to break safeguards and perform illicit tasks. Every model tested was vulnerable to jailbreaks, and the organisation identified more than 62,000 harmful behaviours. For regulated sectors such as finance, healthcare, legal services and the public sector, these findings significantly raise the stakes for artificial intelligence governance and security. They underline that organisations can no longer delegate risk management entirely to trusted vendors and must instead run their own capability assessments, stress tests and red-teaming. Existing guidance from bodies such as the Financial Conduct Authority and the NHS is expected to be updated in light of AISI’s results, and businesses in all industries are being urged to factor these risks into their artificial intelligence strategies at a time when scams targeting enterprises are growing and attackers are rapidly learning how to exploit artificial intelligence frameworks.
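To give a sense of what such in-house stress testing might involve, the sketch below shows a minimal, hypothetical red-teaming harness: it sends a set of placeholder probe prompts to a model endpoint and flags any response that does not look like a refusal. The `query_model` stub, the refusal heuristic and the probe list are all assumptions for illustration only; they do not reflect AISI’s evaluation methodology or any vendor’s API.

```python
# Illustrative sketch only: a minimal red-teaming harness of the kind an
# organisation might run before deploying a model. The model client, the
# probe prompts and the refusal heuristic are hypothetical placeholders.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProbeResult:
    prompt: str
    response: str
    refused: bool


# Phrases that typically indicate the model declined the request.
REFUSAL_MARKERS = ("i can't help", "i cannot help", "i won't assist")


def looks_like_refusal(response: str) -> bool:
    """Crude heuristic: treat the response as a refusal if it contains a
    known refusal phrase. Real evaluations use far stronger grading."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def run_red_team(query_model: Callable[[str], str],
                 probes: List[str]) -> List[ProbeResult]:
    """Send each probe prompt to the model and record whether it refused."""
    results = []
    for prompt in probes:
        response = query_model(prompt)
        results.append(ProbeResult(prompt, response, looks_like_refusal(response)))
    return results


if __name__ == "__main__":
    # Stand-in model client; in practice this would call the vendor's API.
    def fake_model(prompt: str) -> str:
        return "I can't help with that request."

    # Benign placeholder probes; a real exercise would use a vetted,
    # access-controlled library of adversarial prompts.
    probes = ["placeholder probe 1", "placeholder probe 2"]

    for result in run_red_team(fake_model, probes):
        status = "refused" if result.refused else "COMPLIED - review needed"
        print(f"{status}: {result.prompt}")
```

Any probe that is not refused would be logged for human review, which is the basic loop behind the larger-scale automated jailbreak testing AISI describes.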
Regulation in the UK is still lagging behind: unlike the EU, which enacted the EU Artificial Intelligence Act in 2024, the UK has no single piece of legislation to guide or restrain the use of artificial intelligence, meaning AISI’s government-backed insights remain non-binding. The institute’s evaluation methods are also not standardised internationally, with regulators and safety bodies in different jurisdictions using their own tests, a gap that has led some observers to argue that no single assessment regime can yet be used to declare an artificial intelligence model, or the wider sector, definitively safe or unsafe. OpenAI and Anthropic voluntarily submitted their models for AISI evaluations but reiterated concerns about misalignment between the UK institute and its US counterpart, the Center for AI Standards and Innovation. As pressure builds on governments to coordinate and align these frameworks, firms adopting artificial intelligence are being warned that safety cannot be assumed, even when partnering with the most established suppliers.
