Artificial intelligence chatbots cite retracted scientific papers

Studies and tests show that popular Artificial Intelligence chatbots and research tools often cite retracted papers without warning, risking the spread of flawed findings. Companies are adding retraction data, but gaps and inconsistent publisher notices complicate fixes.

Some Artificial Intelligence chatbots are drawing on retracted scientific papers to answer questions, according to recent studies and tests confirmed by MIT Technology Review. Fabricated links and references are a known issue, but even accurate citations can mislead when the underlying papers have been retracted and the answers do not disclose that status. Researchers warn that this poses risks as the public turns to chatbots for medical advice and as students and scientists adopt science-focused Artificial Intelligence tools. In August, the US National Science Foundation invested in building Artificial Intelligence models for science research, suggesting such usage will grow.

In one study, Weikuan Gu and colleagues queried OpenAI’s ChatGPT, running GPT-4o, with prompts based on 21 retracted medical imaging papers. The chatbot referenced retracted papers in five cases and advised caution in only three. Another study, from August, used ChatGPT-4o mini to evaluate 217 retracted and low-quality papers across fields and found that none of the responses mentioned retractions or other concerns. No similar studies have been released on GPT-5. Yuanxi Fu argues that retraction status is an essential quality indicator for tools serving the general public; OpenAI did not respond to requests for comment on the results.

The problem extends beyond ChatGPT. In June, MIT Technology Review tested research-oriented tools including Elicit, Ai2 ScholarQA, Perplexity, and Consensus using questions based on the same 21 retracted papers. Elicit cited five retracted papers, Ai2 ScholarQA 17, Perplexity 11, and Consensus 18, none with explicit retraction warnings. Some providers have since responded. Consensus says it has integrated retraction data from publishers, aggregators, web crawling, and Retraction Watch, and a retest in August saw it cite five retracted papers. Elicit removes retracted items flagged by OpenAlex and is expanding sources. Ai2 says its tool does not automatically detect or remove retractions, while Perplexity notes it does not claim to be 100 percent accurate.

Experts caution that retraction databases remain incomplete and labor-intensive to maintain. Ivan Oransky of Retraction Watch says a truly comprehensive database would require significant resources and manual curation. Publisher practices also vary widely: labels such as correction, expression of concern, erratum, and retracted are applied for different reasons, which complicates automated detection. Retracted papers can also persist across preprint servers and repositories, and models may rely on outdated training data. Most academic search engines do not perform real-time checks against retraction data, leaving accuracy at the mercy of their corpora.
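For tools that do want to check citations at answer time, one lightweight approach is to look up each cited DOI in a bibliographic index that tracks retractions. The sketch below is illustrative only: it assumes the public OpenAlex REST API (the source Elicit reportedly uses for retraction flags) and its is_retracted field; the function name and example DOI are hypothetical, and real coverage would only be as good as the underlying index.

```python
import requests

# Minimal sketch: query OpenAlex for a DOI and read its retraction flag.
# Assumes the public OpenAlex REST API (https://api.openalex.org) and its
# `is_retracted` field; endpoint shape and error handling are illustrative.
OPENALEX_WORK_URL = "https://api.openalex.org/works/doi:{doi}"

def check_retraction(doi: str):
    """Return True/False from OpenAlex's retraction flag, or None if unknown."""
    try:
        resp = requests.get(OPENALEX_WORK_URL.format(doi=doi), timeout=10)
        resp.raise_for_status()
    except requests.RequestException:
        return None  # DOI not indexed, network error, etc.
    return bool(resp.json().get("is_retracted", False))

if __name__ == "__main__":
    # Hypothetical DOI used purely for illustration.
    doi = "10.1234/example-doi"
    print(f"{doi}: retracted={check_retraction(doi)}")
```

A chatbot or research tool could run such a check on every DOI it is about to cite and attach a warning when the flag is true, though the result would still depend on how complete and current the index's retraction data is.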

Suggested remedies include adding more context for models and users, such as linking journal-commissioned peer reviews and critiques on PubPeer alongside papers. Many publishers, including Nature and the BMJ, post retraction notices outside paywalls, and companies are urged to better leverage such signals as well as news coverage of retractions. Until systems improve, experts say both creators and users of Artificial Intelligence tools must exercise skepticism and due diligence.

Impact Score: 68

Nvidia DGX Spark arrives for world’s Artificial Intelligence developers

Nvidia is shipping DGX Spark, a compact desktop system that delivers a petaflop of Artificial Intelligence performance with unified memory, bringing large model development and agent workflows on premises. Partner systems from major PC makers and channel partners broaden availability starting Oct. 15.

EU regulatory developments on the Artificial Intelligence Act

The European Commission finalized a General Purpose Artificial Intelligence Code of Practice and signaled phased enforcement of the Artificial Intelligence Act. Companies gain transitional breathing room but should use it to align with new transparency, copyright, and safety expectations.
