LLM-PIEval: a benchmark for indirect prompt injection attacks in large language models

Large language models have increased interest in Artificial Intelligence and their integration with external tools introduces risks such as direct and indirect prompt injection. LLM-PIEval provides a framework and test set to measure indirect prompt injection risk and the authors release API specifications and prompts to support wider assessment.

Large language models have become widely used in applications such as virtual assistants and smart home agents, driving broader interest in Artificial Intelligence. That same integration with external tools creates attackers’ opportunities, including direct prompt injection when malicious instructions appear in a user query and indirect prompt injection when harmful instructions are present in the retrieved information payload of retrieval augmented generation systems. The article notes indirect prompt injection carries particular risk because end users may not be aware of new attacks when they occur and detailed benchmarking of models on this threat remains limited.

To address that gap, the authors develop LLM-PIEval, a framework designed to measure any candidate large language model for its vulnerability to indirect prompt injection attacks. Using the framework the team created a new test set and used it to evaluate several state of the art large language models. The reported results show strong attack success rates across most evaluated models, demonstrating that indirect prompt injection is an active and measurable threat to current model deployments.

The authors release their generated test set together with API specifications and prompts to enable broader assessment of this risk in current large language models. By publishing these artifacts the work aims to make it easier for researchers and practitioners to evaluate model robustness to indirect prompt injection and to compare defenses and mitigations across systems. The paper frames LLM-PIEval as a practical, shareable resource to support more systematic security testing in conversational and retrieval augmented workflows.

58

Impact Score

Best artificial intelligence video generators for every creator

Leading artificial intelligence video tools like Sora, Veo 3, Adobe Firefly, Runway and Midjourney target different needs, from free social clips to commercially safe productions, but all come with legal and ethical tradeoffs. Choosing the right platform means balancing price, creative control, output quality and how each service handles your data and copyrights.

UK mps open inquiry into artificial intelligence and edtech in education

UK mps have launched a cross party inquiry into how artificial intelligence and education technology are reshaping learning across early years, schools, colleges and universities, and how government should balance innovation with safeguards. The education committee will examine opportunities to improve teaching and workload alongside risks around inequality, privacy, safeguarding and assessment.

Contact Us

Got questions? Use the form to contact us.

Contact Form

Clicking next sends a verification code to your email. After verifying, you can enter your message.