New test probes language model confidence to flag misinformation

Researchers at Michigan State University have developed a method to detect when large language models give unreliable answers by testing how stable their internal representations are under small perturbations.

Large language models can pose a serious risk in high-stakes fields such as medicine when they provide an incorrect diagnosis and then insist the answer is correct. Overconfidence in these systems can mislead users who rely on the output for decisions that directly affect human health, safety and financial well-being. The challenge is not only whether a model is accurate, but also whether it can signal when it does not actually know the right answer.

A team led by Mohammad Ghassemi, an assistant professor in the Department of Computer Science and Engineering at Michigan State University, has introduced a test designed to identify when a large language model is likely providing misinformation. The method, called CCPS, which stands for Calibrating LLM Confidence by Probing Perturbed Representation Stability, works by slightly perturbing the model’s internal state and then observing how consistently it responds. If small changes cause the answer to shift, the original answer was probably unreliable to begin with, offering a systematic way to flag overconfident but unstable outputs. Ghassemi said the work makes large language models more accurate and more honest about what they do and do not know.
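The core intuition can be illustrated with a toy sketch: perturb a hidden representation with small random noise many times and measure how often the decoded answer stays the same. This is only a minimal illustration of the stability idea, not the authors' CCPS implementation; the linear `readout` matrix and the noise scale are hypothetical stand-ins for a real model's internals.

```python
import numpy as np

def stability_confidence(hidden, readout, n_probes=200, noise_scale=0.5, seed=0):
    """Fraction of noise-perturbed runs that preserve the original answer.

    hidden:  1-D array, a model's internal representation for one query
    readout: 2-D array mapping the hidden state to answer logits
    A value near 1.0 means the answer is stable under perturbation;
    a value near 0.5 (for two candidates) means it flips almost at random.
    """
    rng = np.random.default_rng(seed)
    baseline = int(np.argmax(readout @ hidden))  # unperturbed answer
    agree = 0
    for _ in range(n_probes):
        noisy = hidden + rng.normal(0.0, noise_scale, size=hidden.shape)
        if int(np.argmax(readout @ noisy)) == baseline:
            agree += 1
    return agree / n_probes

# Toy demo with two candidate answers in a 2-D hidden space:
readout = np.array([[1.0, 0.0],
                    [0.0, 1.0]])
confident = np.array([5.0, 0.0])    # far from the decision boundary
borderline = np.array([0.51, 0.49]) # nearly tied between the two answers

print(stability_confidence(confident, readout))   # stays high under noise
print(stability_confidence(borderline, readout))  # flips often -> low score
```

In this sketch the borderline state's answer flips on roughly half of the perturbed runs, while the confident state's answer essentially never does, which is the kind of separation a stability probe exploits.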

In healthcare, the approach is positioned to support more reliable Artificial Intelligence models that can help predict neurological outcomes after cardiac arrest, optimize cancer treatment through radiomics and automate radiology quality control, while keeping physicians in control of final decisions. In finance, CCPS is described as providing enhanced risk assessment and market forecasting, helping decision makers detect emerging innovations, market disruptions and credit risks earlier. The technique is also framed as having broader potential to make Artificial Intelligence more dependable for policymakers, educators and researchers whose choices influence human well-being.

Impact Score: 58

Who decides how America uses Artificial Intelligence in war

Stanford experts are divided over how the United States should govern Artificial Intelligence in defense, surveillance, and warfare. Their views converge on one point: decisions with such high stakes cannot be left to companies alone.

GPUBreach bypasses IOMMU on GDDR6-based NVIDIA GPUs

Researchers from the University of Toronto describe GPUBreach, a rowhammer attack against GDDR6-based NVIDIA GPUs that can bypass IOMMU protections. The technique enables CPU-side privilege escalation by abusing trusted GPU driver behavior on the host system.

Google Vids opens free video generation to all Google users

Google has made Google Vids available to anyone with a Google account, adding free access to video generation with its latest models. The move expands Google’s end-to-end video workflow and increases pressure on rivals that charge for similar tools.

Court warns against chatbot legal advice in Heppner case

A federal court found that chats with a publicly available generative Artificial Intelligence tool were not protected by attorney-client privilege or the work-product doctrine. The ruling highlights litigation risks when executives or employees use chatbots for legal guidance without lawyer supervision.
