Inside the UK’s Artificial Intelligence Security Institute

The UK's Artificial Intelligence Security Institute has found that popular frontier models can be jailbroken at scale, exposing reliability gaps and security risks for governments and regulated industries that rely on trusted vendors.

The UK’s Artificial Intelligence Security Institute (AISI), formerly the Artificial Intelligence Safety Institute, was established by the UK government in 2024 to investigate both the capabilities and risks of frontier artificial intelligence models. In new evaluations, the organisation reports that even the most advanced systems can be misused, challenging assumptions about vendor trust and model safety just as adoption of artificial intelligence accelerates. AISI has focused on how models perform in technical tasks such as biological research and software development, while also assessing how easily they can be manipulated to support illicit activity.

So far, AISI has published detailed performance tests on two widely used models: OpenAI o1 and Claude 3.5 Sonnet. According to AISI, OpenAI’s first reasoning model, o1, performed comparably overall to the startup’s internal reference model, GPT-4o, and both systems showed similar inherent cybersecurity risks. However, o1 suffers from several unique reliability and tooling problems. It underperformed GPT-4o in general reasoning and coding, although the two models were near equals in technical areas such as biological research. Claude 3.5 Sonnet far outperformed other models in biological research, as well as in engineering and reasoning tasks, but its guardrails are less robust, and AISI identified several ways to jailbreak the system to elicit dangerous responses.

Beyond these public case studies, AISI has evaluated 22 anonymised models, logging 1.8 million attempts to break safeguards and perform illicit tasks. Every model tested was vulnerable to jailbreaks, and the organisation identified more than 62,000 harmful behaviours. For regulated sectors such as finance, healthcare, legal services and the public sector, these findings significantly raise the stakes for artificial intelligence governance and security. They underline that organisations can no longer delegate risk management entirely to trusted vendors and must instead run capability assessments, stress tests and red-teaming themselves. Existing guidance from bodies such as the Financial Conduct Authority and the NHS is expected to be updated in light of AISI’s results. Businesses in all industries are being urged to factor these risks into their artificial intelligence strategies at a time when the market for enterprise scams is growing and attackers are rapidly learning how to exploit artificial intelligence frameworks.

Regulation is still lagging in the UK. Unlike the EU, which enacted the EU Artificial Intelligence Act in 2024, the UK has no single piece of legislation to guide or restrain the use of artificial intelligence, meaning AISI’s government-backed insights remain nonbinding. The institute’s evaluation methods are also not standardised internationally: regulators and safety bodies in different jurisdictions use their own tests, a gap that has led some observers to argue that no single assessment regime can yet be used to declare an artificial intelligence model, or the wider sector, definitively safe or unsafe. OpenAI and Anthropic voluntarily submitted their models for AISI evaluations but reiterated concerns about misalignment between the UK institute and its US counterpart, the Center for Artificial Intelligence Standards and Innovation. As pressure builds on governments to coordinate and align these frameworks, firms adopting artificial intelligence are being warned that safety cannot be assumed, even when partnering with the most established suppliers.
