Benchmarking large language model vulnerability to insecure code via few-shot inversion

Researchers introduce a novel few-shot inversion method to systematically evaluate how large language models generate insecure code, creating the first open benchmark for code security risks in Artificial Intelligence systems.

Large language models for code generation, such as ChatGPT, Codex, and GitHub Copilot, have rapidly become foundational tools for programmers, thanks to their impressive ability to automate code writing and completion. However, the data these models are trained on often includes unsanitized code from open-source repositories, which might harbor vulnerabilities and flaws. While prior evaluations have focused on the functional correctness of generated code, systematic analysis of the security posture of such outputs has been lacking, presenting an urgent risk as these models are integrated into development workflows globally.

The study, conducted by researchers at the CISPA Helmholtz Center for Information Security, introduces a new method for probing the security weaknesses of code-generating language models. Their technique uses few-shot prompting to approximate model inversion within a black-box context—meaning the model´s internal workings remain opaque. By providing a handful of example prompts and corresponding vulnerable code snippets, the model is guided to identify prompts that can consistently trigger the generation of insecure code patterns. This technique enables automated, large-scale discovery of vulnerabilities in the outputs of state-of-the-art models, surpassing previous manual or one-off prompt engineering approaches by facilitating high-throughput, vulnerability-specific benchmarking.

In real-world trials, the researchers´ method generated a diverse dataset of over 2,000 prompts that led models to produce Python and C code with critical security flaws. The findings show that such prompts are often transferable across models, exacerbating the systemic risk posed by code-generating Artificial Intelligence. To encourage security benchmarking and improvements, the team has released both their methodology and the resulting dataset as an open-source toolkit, empowering the research and development community to evaluate and compare the security performance of various language models. This framework is extensible, enabling the detection of new vulnerability classes as they emerge and offering a practical path forward for integrating robust security checks into the evolving landscape of Artificial Intelligence-powered code generation.

77

Impact Score

Intel details disaggregated Core Ultra Series 3 Panther Lake H die

Intel’s Core Ultra Series 3 Panther Lake H mobile processors use a disaggregated multi-tile design that splits compute, graphics, and I/O across different process nodes. The layout closely follows Lunar Lake, with variations in graphics tiles between mainstream and ultraportable configurations.

Pentagon surveillance powers collide with artificial intelligence limits

A dispute between the Pentagon and leading artificial intelligence companies is exposing how far US surveillance law lags behind modern data collection and analysis capabilities. Contracts, not legislation, are currently setting the boundaries for military use of powerful artificial intelligence tools.

Contact Us

Got questions? Use the form to contact us.

Contact Form

Clicking next sends a verification code to your email. After verifying, you can enter your message.