South Korean researchers have developed a new way to make Artificial Intelligence models acknowledge unfamiliarity with topics in a manner intended to resemble human behaviour. Researchers from the Korea Advanced Institute of Science and Technology say the advance could improve the reliability of Artificial Intelligence systems used in areas such as autonomous driving and medicine, where overconfident mistakes can carry serious consequences.
Previous research has identified Artificial Intelligence overconfidence as a major risk when such systems are used to support decisions, especially in medical diagnosis. Commonly used models such as OpenAI’s ChatGPT have been shown to hallucinate, or make up facts, because they are incentivised to produce guesses instead of admitting a lack of knowledge. Researchers say a fundamental cause of this overconfidence lies in the earliest stage of learning, when an artificial neural network is first set up. Small errors introduced at that stage can spread through later training and lead to significant mistakes. The researchers found that when random data was fed into a neural network during the initialisation phase, the model exhibited high confidence in its outputs despite not having learned anything, and that this unfounded confidence led to hallucination.
To address the problem, the researchers drew on clues from human brain development. In humans, the brain generates signals without any external input even before birth, which the researchers say helps it manage the same problem. Mimicking this, the scientists developed a system in which the neural network backbone of an Artificial Intelligence model undergoes brief pre-training on random noise inputs before it begins learning from real data. According to the researchers, this warm-up process helps the model establish a baseline by adjusting its own uncertainty before actual learning starts.
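The warm-up idea can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the researchers' implementation: it uses a tiny linear softmax classifier in plain Python, and it assumes that the random noise inputs are paired with random labels during warm-up, a detail the article does not specify. The sizes, the learning rate, the step count and the deliberately large initial weights are all hypothetical choices made so that the untrained model starts out overconfident.

```python
import math
import random

NUM_CLASSES = 4  # hypothetical toy sizes, not taken from the study
INPUT_DIM = 8


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)


def noise_input(rng):
    return [rng.gauss(0.0, 1.0) for _ in range(INPUT_DIM)]


def mean_confidence(weights, rng, samples=500):
    """Average top-class probability on fresh random noise inputs."""
    total = sum(max(predict(weights, noise_input(rng))) for _ in range(samples))
    return total / samples


def warm_up(weights, rng, steps=6000, lr=0.02):
    """Gradient steps on cross-entropy using noise inputs and random labels.

    Pairing noise with random labels is an assumption of this sketch.
    """
    for _ in range(steps):
        x = noise_input(rng)
        label = rng.randrange(NUM_CLASSES)
        probs = predict(weights, x)
        for k, row in enumerate(weights):
            err = probs[k] - (1.0 if k == label else 0.0)
            for j in range(INPUT_DIM):
                row[j] -= lr * err * x[j]
    return weights


rng = random.Random(0)
# Assumption: large initial weights stand in for an overconfident untrained model.
weights = [[rng.gauss(0.0, 2.0) for _ in range(INPUT_DIM)]
           for _ in range(NUM_CLASSES)]

before = mean_confidence(weights, rng)
warm_up(weights, rng)
after = mean_confidence(weights, rng)

print(f"chance level: {1 / NUM_CLASSES:.2f}")
print(f"mean confidence before warm-up: {before:.2f}")
print(f"mean confidence after warm-up:  {after:.2f}")
```

Because the labels carry no information about the inputs, the only way for the toy model to reduce its loss is to spread its probability more evenly across the classes, which pulls its confidence down towards the chance level before any real data is seen.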
The warm-up process can help an Artificial Intelligence model set its initial confidence to a low level close to chance and significantly reduce its overconfidence bias. Researchers say the method helps models first learn the state of “I don’t know anything yet”. They reported that conventional models tend to give incorrect answers with high confidence even for data they have not encountered during training, while models with warm-up training showed a clearer ability to lower their confidence and recognise that they do not know. Se-Bum Paik, an author of the study published in Nature Machine Intelligence, said the findings show that incorporating key principles of brain development can help Artificial Intelligence recognise its own knowledge state in a more human-like way and better understand when it is uncertain or might be mistaken.
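To make "confidence close to chance" concrete, the sketch below compares a model's top-class probability with the chance level of one divided by the number of classes, and lets the model answer "I don't know" when the two are close. It is an illustration only: the function names, the abstention margin and the probability values are hypothetical and are not results from the study.

```python
def chance_level(num_classes):
    """Confidence of a pure guess among equally likely classes."""
    return 1.0 / num_classes


def overconfidence_gap(prob_rows, labels):
    """Mean top-class probability minus accuracy; positive means overconfident."""
    confidences = [max(row) for row in prob_rows]
    predictions = [row.index(max(row)) for row in prob_rows]
    accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
    return sum(confidences) / len(confidences) - accuracy


def answer_or_abstain(probs, margin=0.2):
    """Return the predicted class, or None ("I don't know") near chance.

    The margin is a hypothetical threshold chosen for this example.
    """
    confidence = max(probs)
    if confidence - chance_level(len(probs)) < margin:
        return None
    return probs.index(confidence)


# Made-up outputs for two unfamiliar inputs whose true classes are 2 and 3.
conventional = [[0.97, 0.01, 0.01, 0.01], [0.90, 0.05, 0.03, 0.02]]
warmed_up = [[0.30, 0.25, 0.25, 0.20], [0.28, 0.26, 0.24, 0.22]]
labels = [2, 3]

gap_conventional = overconfidence_gap(conventional, labels)
gap_warmed_up = overconfidence_gap(warmed_up, labels)
answers_conventional = [answer_or_abstain(p) for p in conventional]
answers_warmed_up = [answer_or_abstain(p) for p in warmed_up]

print(f"overconfidence gap, conventional: {gap_conventional:.3f}")
print(f"overconfidence gap, warmed up:    {gap_warmed_up:.3f}")
print("conventional answers:", answers_conventional)
print("warmed-up answers:   ", answers_warmed_up)
```

In this made-up case both models are wrong on both inputs, but only the one whose confidence sits near chance declines to answer, which is the behaviour the researchers describe as recognising that it does not know.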
