New test probes language model confidence to flag misinformation

Researchers at Michigan State University have developed a method to detect when large language models give unreliable answers by testing how stable their internal representations are under small perturbations.

Large language models can pose a serious risk in high-stakes fields such as medicine when they provide an incorrect diagnosis and then insist the answer is correct. Overconfidence in these systems can mislead users who rely on the output for decisions that directly affect human health, safety and financial well-being. The challenge is not only whether a model is accurate, but also whether it can signal when it does not actually know the right answer.

A team led by Mohammad Ghassemi, an assistant professor in the Department of Computer Science and Engineering at Michigan State University, has introduced a test designed to identify when a large language model is likely providing misinformation. The method, called CCPS (Calibrating LLM Confidence by Probing Perturbed Representation Stability), works by slightly perturbing the model's internal state and then observing how consistently it responds. If small changes cause the answer to shift, the original answer was probably unreliable to begin with, which gives a systematic way to flag overconfident but unstable outputs. Ghassemi said the work makes large language models more accurate and more honest about what they do and do not know.
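To make the general idea concrete, the sketch below adds small Gaussian noise to one intermediate layer of a language model and checks how often its answer changes. This is not the authors' CCPS implementation; the model ("gpt2"), the choice of perturbed layer, the noise scale and the simple agreement score are all illustrative assumptions, chosen only to show what "perturbing the internal state and observing consistency" can look like in code.

```python
# Illustrative sketch (not the authors' CCPS code): perturb an intermediate
# hidden state with Gaussian noise and measure how stable the model's greedy
# answer is. Model, layer, noise scale and metric are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small placeholder model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "Q: What is the capital of France? A:"
inputs = tokenizer(prompt, return_tensors="pt")

def greedy_next_token():
    """Return the model's greedy next-token prediction for the prompt."""
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits[0, -1].argmax().item()

baseline = greedy_next_token()

layer = model.transformer.h[6]  # an arbitrary mid-depth transformer block
noise_scale = 0.05              # illustrative perturbation size
trials = 20
flips = 0

def add_noise(module, args, output):
    # Forward hook: replace the block's hidden states with a noisy copy.
    hidden = output[0] if isinstance(output, tuple) else output
    noisy = hidden + noise_scale * torch.randn_like(hidden)
    return (noisy,) + output[1:] if isinstance(output, tuple) else noisy

for _ in range(trials):
    handle = layer.register_forward_hook(add_noise)
    try:
        perturbed = greedy_next_token()
    finally:
        handle.remove()
    flips += int(perturbed != baseline)

stability = 1.0 - flips / trials
print(f"Answer {tokenizer.decode([baseline])!r}, stability under noise: {stability:.2f}")
# A low stability score suggests the answer is fragile and should be trusted less.
```

In this toy setup, an answer the model "knows" tends to survive small perturbations, while an unreliable answer flips easily; CCPS builds a calibrated confidence signal from this kind of stability information rather than the raw agreement count used here.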

In healthcare, the approach is positioned to support more reliable artificial intelligence models that can help predict neurological outcomes after cardiac arrest, optimize cancer treatment through radiomics and automate radiology quality control, while keeping physicians in control of final decisions. In finance, CCPS is described as improving risk assessment and market forecasting, helping decision makers spot emerging innovations, market disruptions and credit risks earlier. The technique is also framed as having broader potential to make artificial intelligence more dependable for policymakers, educators and researchers whose choices influence human well-being; further technical details and academic profiles are available for readers who want to explore the work in greater depth.
