Anthropic´s new research exposes large language model vulnerabilities with SnitchBench

Anthropic´s research, using the creative SnitchBench benchmark, reveals that models from every major provider are vulnerable to prompt extraction attacks in the Artificial Intelligence landscape.

Anthropic has introduced new research that underscores vulnerabilities present in large language models across major providers. The study leverages a playful-yet-serious benchmark dubbed ´SnitchBench,´ inspired by Theo´s earlier prompt leakage tool, to evaluate how easily proprietary prompts can be extracted from popular Artificial Intelligence models.

The findings were stark: all leading models, regardless of origin, failed to prevent targeted extraction of their underlying prompts. This systematic weakness leaves proprietary and possibly sensitive prompt data exposed to prompt extraction attacks. The research demonstrates that these vulnerabilities are not isolated incidents or simple misconfigurations but represent a broader challenge across the current generation of language models.

SnitchBench works by automating the process of attempting to coax, trick, or otherwise manipulate a model into revealing the system prompt or other embedded content that ideally should remain undisclosed. Anthropic´s work has reignited a conversation around the privacy, security, and robustness of Artificial Intelligence model deployment. The results suggest a pressing need for the entire industry to bolster model safeguards and further invest in privacy-centric mitigation techniques before deploying these models into sensitive or mission-critical applications.

76

Impact Score

How hackers poison Artificial Intelligence business tools and defences

Researchers report attackers are now planting hidden prompts in emails to hijack enterprise Artificial Intelligence tools and even tamper with Artificial Intelligence-powered security features. With most organisations adopting Artificial Intelligence, email must be treated as an execution environment with stricter controls.

Meta unveils Business Artificial Intelligence as a 24/7 sales agent

Meta launched Business Artificial Intelligence, a customer assistant that lives across Facebook, Instagram and even third-party sites to answer questions, recommend products and guide checkout. The company is also rolling out generative Artificial Intelligence and creator tools to help brands produce targeted ads and scale influencer campaigns.

Latest Artificial Intelligence news in finance

Finextra’s Artificial Intelligence coverage this week spans central bank pilots, bank deployments, and new vendor products, plus insights from Sibos 2025 and a FinextraTV interview. Here are the key developments and themes.

Contact Us

Got questions? Use the form to contact us.

Contact Form

Clicking next sends a verification code to your email. After verifying, you can enter your message.