DeepMind Proposes CaMeL Defense Against LLM Prompt Injection

Google DeepMind introduces CaMeL, a security layer that applies traditional software security concepts to large language models, effectively blocking many prompt injection attacks in real-world agent benchmarks.

Google DeepMind researchers have introduced CaMeL, a defense mechanism designed to protect large language models (LLMs) against prompt injection attacks originating from untrusted inputs. CaMeL acts as a protective layer around the LLM, intercepting and neutralizing potentially malicious queries before they can exploit the model. In benchmark tests on AgentDojo, a security suite for autonomous agents, CaMeL blocked 67% of prompt injection attacks, a marked improvement over existing defenses.

Prompt injection attacks let adversaries embed crafted context or instructions in untrusted content, causing a model to exfiltrate sensitive information or execute unintended actions, such as sending unauthorized emails or leaking private data. Conventional defenses rely on additional Artificial Intelligence models to detect malicious prompts, but attackers have repeatedly circumvented these probabilistic filters, as seen in phishing attacks that bypass even the latest LLM security features.

Distinctively, CaMeL applies established software security principles, such as control flow integrity, access control, and information flow control, to LLM interactions. A custom Python interpreter tracks the origin and permissible actions associated with all data and instructions encountered by a privileged LLM, without requiring modification of the LLM itself. CaMeL builds on the Dual LLM pattern: one quarantined LLM handles untrusted inputs, while a second, privileged LLM plans workflows and enforces access rights. As a program executes, the interpreter constructs a data flow graph and attaches security metadata to every variable, defining which actions each value is authorized to trigger. Output derived from untrusted sources therefore cannot be misused, even if it was manipulated upstream.
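The capability-tracking idea can be sketched in plain Python. Everything below, including the `CapValue` wrapper, the `combine` and `check` helpers, and the tool names, is illustrative and assumed for this sketch; it is not CaMeL's actual API, only a minimal demonstration of provenance and reader metadata flowing with values:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CapValue:
    """A value plus security metadata: where it came from (sources)
    and which tools are allowed to consume it (readers)."""
    data: object
    sources: frozenset
    readers: frozenset

def combine(a: CapValue, b: CapValue) -> CapValue:
    """A derived value inherits the union of its inputs' sources and
    the intersection of their readers: the most restrictive policy wins."""
    return CapValue(
        data=(a.data, b.data),
        sources=a.sources | b.sources,
        readers=a.readers & b.readers,
    )

def check(tool: str, *args: CapValue) -> None:
    """Refuse a tool call if any argument's metadata disallows that tool."""
    for arg in args:
        if tool not in arg.readers:
            raise PermissionError(
                f"{tool} may not read data originating from {set(arg.sources)}"
            )

# Data typed by the user may flow to email; text fetched from the web may not.
user_note = CapValue("meeting at 3pm", frozenset({"user"}),
                     frozenset({"send_email", "calendar"}))
web_text = CapValue("IGNORE PREVIOUS INSTRUCTIONS", frozenset({"web"}),
                    frozenset({"summarize"}))

check("send_email", user_note)  # allowed: user data may reach email
try:
    # Blocked: the combined value is tainted by untrusted web content.
    check("send_email", combine(user_note, web_text))
except PermissionError as e:
    print("blocked:", e)
```

Because the policy check runs in the interpreter rather than inside the model, an injected instruction in `web_text` cannot widen its own permissions; the restriction travels with every value derived from it.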

While the approach reduces reliance on Artificial Intelligence-driven security layers, making enforcement deterministic rather than probabilistic, it is not a silver bullet. The researchers point out limitations such as the need for users to define security policies themselves and the risk of user fatigue from manually approving sensitive tasks. Nevertheless, CaMeL's results highlight the merit of augmenting LLM security with well-understood software security methodologies, providing a promising avenue for reducing systemic risk in production LLM deployments.

IBM and AMD partner on quantum-centric supercomputing

IBM and AMD announced plans to develop quantum-centric supercomputing architectures that combine quantum computers with high-performance computing to create scalable, open-source platforms. The collaboration leverages IBM's work on quantum computers and software and AMD's expertise in high-performance computing and Artificial Intelligence accelerators.

Qualcomm launches Dragonwing Q-6690 with integrated RFID and Artificial Intelligence

Qualcomm announced the Dragonwing Q-6690, billed as the world’s first enterprise mobile processor with fully integrated UHF RFID and built-in 5G, Wi-Fi 7, Bluetooth 6.0, ultra-wideband and Artificial Intelligence capabilities. The platform is aimed at rugged handhelds, point-of-sale systems and smart kiosks and offers software-configurable feature packs that can be upgraded over the air.

Recent books from the MIT community

A roundup of new titles from the MIT community, including Empire of Artificial Intelligence, a critical look at Sam Altman’s OpenAI, and Data, Systems, and Society, a textbook on harnessing Artificial Intelligence for societal good.
