Inside the UK’s artificial intelligence security institute

The UK's artificial intelligence security institute has found that popular frontier models can be jailbroken at scale, exposing reliability gaps and security risks for governments and regulated industries that rely on trusted vendors.

The UK’s artificial intelligence security institute (AISI), formerly the artificial intelligence safety institute, was established by the UK government in 2024 to investigate both the capabilities and risks of frontier artificial intelligence models. In new evaluations, the organisation reports that even the most advanced systems can be misused, challenging assumptions about vendor trust and model safety just as adoption of artificial intelligence accelerates. AISI has focused on how models perform in technical tasks such as biological research and software development, while also assessing how easily they can be manipulated to support illicit activity.

So far, AISI has published detailed performance tests on two widely used models: OpenAI o1 and Claude 3.5 Sonnet. OpenAI’s first reasoning model, o1, performed comparably overall to the startup’s internal reference model, GPT-4o, according to AISI, which also observed similar inherent cybersecurity risks in both systems. However, o1 suffers from several unique reliability and tooling problems, and for general reasoning and coding, o1 underperformed against GPT-o4, although the two models were near equals in technical areas such as biological research. Claude 3.5 Sonnet far outperformed other models in biological research, as well as in engineering and reasoning tasks, but its guardrails are less robust and AISI identified several ways to jailbreak the system to elicit dangerous responses.

Beyond these public case studies, AISI has evaluated 22 anonymised models, with 1.8 million total attempts to break safeguards and perform illicit tasks, and every model tested was vulnerable to jailbreaks, with the organisation identifying more than 62,000 harmful behaviours. For regulated sectors such as finance, healthcare, legal services and the public sector, these findings significantly raise the stakes for artificial intelligence governance and security, underlining that organisations can no longer delegate risk management entirely to trusted vendors and must instead run capability assessments, stress tests and red-teaming. Existing guidance from bodies such as the Financial Conduct Authority and the NHS is expected to be updated in light of AISI’s results, and businesses in all industries are being urged to factor these risks into their artificial intelligence strategies at a time when the market for enterprise scams is growing and attackers are rapidly learning how to exploit artificial intelligence frameworks.

Regulation is still lagging behind in the UK, which unlike the EU, which enacted the EU artificial intelligence act in 2024, has no single legislation to guide or restrain the use of artificial intelligence, meaning AISI’s government-backed insights remain nonbinding. The institute’s evaluation methods are also not standardised internationally, with regulators and safety bodies in different jurisdictions using their own tests, a gap that has led some observers to argue that no single assessment regime can yet be used to declare an artificial intelligence model, or the wider sector, definitively safe or unsafe. OpenAI and Anthropic voluntarily submitted their models for AISI evaluations but reiterated concerns about misalignment between the UK institute and its US counterpart, the center for artificial intelligence standards and innovation. As pressure builds on governments to coordinate and align these frameworks, firms adopting artificial intelligence are being warned that safety cannot be assumed, even when partnering with the most established suppliers.

68

Impact Score

Artificial Intelligence, chips, and robots set the tone at CES 2026

CES 2026 in Las Vegas put Artificial Intelligence at the center of nearly every major announcement, with chipmakers and robotics firms using the show to preview their next wave of platforms and humanoid systems. Nvidia, AMD, Intel, Qualcomm, Google, Samsung, Hyundai, and Boston Dynamics all leaned on Artificial Intelligence to anchor their product strategies.

Siemens debuts digital twin composer for industrial metaverse deployments

Siemens has introduced digital twin composer, a software tool that builds industrial metaverse environments at scale by merging comprehensive digital twins with real-time physical data, enabling faster virtual decision making. Early deployments with PepsiCo report higher throughput, shorter design cycles and reduced capital expenditure through physics-accurate simulations and artificial intelligence driven optimization.

Cadence builds chiplet partner ecosystem for physical artificial intelligence and data center designs

Cadence has introduced a Chiplet Spec-to-Packaged Parts ecosystem aimed at simplifying chiplet design for physical artificial intelligence, data center and high performance computing workloads, backed by a roster of intellectual property and foundry partners. The program centers on a physical artificial intelligence chiplet platform and framework that integrates prevalidated components to cut risk and speed commercial deployment.

Contact Us

Got questions? Use the form to contact us.

Contact Form

Clicking next sends a verification code to your email. After verifying, you can enter your message.