Some artificial intelligence chatbots are drawing on retracted scientific papers to answer questions, according to recent studies and tests confirmed by MIT Technology Review. Fabricated links and references are a known problem, but even accurate citations can mislead when the underlying papers have been retracted and the answers do not disclose that status. Researchers warn that this poses risks as the public turns to chatbots for medical advice and as students and scientists adopt science-focused AI tools. In August, the US National Science Foundation invested in building AI models for science research, suggesting such usage will grow.
In one study, Weikuan Gu and colleagues queried OpenAI's ChatGPT, running GPT-4o, with prompts based on 21 retracted medical imaging papers. The chatbot referenced retracted papers in five cases but advised caution in only three. Another study, from August, used ChatGPT-4o mini to evaluate 217 retracted and low-quality papers across fields and found that none of the responses mentioned retractions or other concerns. No similar studies have been released on GPT-5. Yuanxi Fu argues that retraction status is an essential quality indicator for tools serving the general public; OpenAI did not respond to requests for comment on the results.
The problem extends beyond ChatGPT. In June, MIT Technology Review tested research-oriented tools including Elicit, Ai2 ScholarQA, Perplexity, and Consensus with questions based on the same 21 retracted papers. Elicit cited five retracted papers, Ai2 ScholarQA 17, Perplexity 11, and Consensus 18, none with explicit retraction warnings. Some providers have since responded. Consensus says it has integrated retraction data from publishers, aggregators, web crawling, and Retraction Watch, and in an August retest it cited only five retracted papers. Elicit removes retracted items flagged by OpenAlex and is expanding its sources. Ai2 says its tool does not automatically detect or remove retractions, while Perplexity notes that it does not claim to be 100 percent accurate.
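Elicit's reliance on OpenAlex flags hints at what such a check can look like in practice. The sketch below is a minimal illustration, not how any of these products is actually implemented: it queries the public OpenAlex API, whose work records carry an `is_retracted` field, for a placeholder DOI. Coverage of the flag depends on OpenAlex's own upstream sources, so an absent flag is not proof a paper stands.

```python
import requests

OPENALEX_WORKS = "https://api.openalex.org/works/doi:{doi}"

def is_retracted(doi: str) -> bool | None:
    """Look up a DOI in OpenAlex and return its is_retracted flag.

    Returns None when the work is not found or the request fails,
    so callers can distinguish 'unknown' from 'no retraction flag'.
    """
    resp = requests.get(OPENALEX_WORKS.format(doi=doi), timeout=10)
    if resp.status_code != 200:
        return None
    return bool(resp.json().get("is_retracted", False))

# Hypothetical DOI, used purely for illustration.
status = is_retracted("10.1234/example-doi")
print({True: "retracted", False: "no retraction flag", None: "unknown"}[status])
```

A tool that cites papers could run a check like this at answer time rather than relying on whatever snapshot sat in its training or index corpus.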
Experts caution that retraction databases remain incomplete and labor-intensive to maintain. Ivan Oransky of Retraction Watch says a truly comprehensive database would require significant resources and manual curation. Publisher practices also vary widely, with labels such as "correction," "expression of concern," "erratum," and "retracted" applied for different reasons, which complicates automated detection. Papers can also persist across preprint servers and repositories, and models may rely on outdated training data. Most academic search engines do not perform real-time checks against retraction data, leaving accuracy at the mercy of their corpora.
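That label variety is visible in publisher metadata itself. As a rough sketch, assuming Crossref's documented `updates` filter behaves as described in its REST API documentation, the snippet below asks for editorial notices recorded against a DOI; the notice's `type` field is where the differing labels (retraction, correction, erratum, expression of concern) surface, and coverage depends entirely on what each publisher deposits. The DOI is again a placeholder.

```python
import requests

CROSSREF_WORKS = "https://api.crossref.org/works"

def editorial_updates(doi: str) -> list[dict]:
    """Return editorial-update notices (retractions, corrections, errata,
    expressions of concern) that Crossref records against a DOI."""
    resp = requests.get(
        CROSSREF_WORKS,
        params={"filter": f"updates:{doi}", "rows": 20},
        timeout=10,
    )
    resp.raise_for_status()
    notices = []
    for item in resp.json()["message"]["items"]:
        for upd in item.get("update-to", []):
            if upd.get("DOI", "").lower() == doi.lower():
                # 'type' carries the publisher's chosen label, e.g.
                # 'retraction', 'correction', 'erratum',
                # 'expression_of_concern'.
                notices.append({"type": upd.get("type"),
                                "notice_doi": item.get("DOI")})
    return notices

# Hypothetical DOI, for illustration only.
for n in editorial_updates("10.1234/example-doi"):
    print(n["type"], "->", n["notice_doi"])
```

Because the same underlying problem can surface under any of these labels, a detector that matches only the word "retracted" will miss notices that a human reader would treat as disqualifying.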
Suggested remedies include giving models and users more context, such as linking journal-commissioned peer reviews and PubPeer critiques alongside papers. Many publishers, including Nature and the BMJ, post retraction notices outside paywalls, and companies are urged to make better use of such signals, as well as of news coverage of retractions. Until systems improve, experts say both creators and users of AI tools must exercise skepticism and due diligence.