Chemma language model accelerates reaction optimisation but sparks debate

Chemma, a new large language model, promises faster reaction predictions for chemists but raises questions about overreliance on artificial intelligence in lab work.

Chemma, a large language model (LLM) designed specifically for organic chemistry, is changing how researchers approach reaction prediction and synthesis planning. Developed by Yanyan Xu and colleagues at Shanghai Jiao Tong University, Chemma was trained on over 1.28 million chemistry question-and-answer pairs derived from public datasets. It has three main tasks: predicting reaction outcomes, devising retrosynthetic routes to target molecules, and recommending optimal conditions for chemical reactions. The model is built on the open-source Llama-2-7B model and is tailored for active, feedback-loop learning, so it can refine its recommendations using results from real experiments together with expert guidance.

The practical impact of Chemma is already evident. In a key demonstration, Chemma was used to identify optimal conditions for a previously unexplored Suzuki–Miyaura cross-coupling reaction, achieving a 67% yield in just 15 experiments, a process that would traditionally require hundreds of trials and weeks of work. Chemma’s performance surpasses older approaches in single-step retrosynthesis and yield prediction. Unlike conventional methods that rely on computationally intensive quantum-chemical calculations, Chemma provides fast, data-driven predictions that allow researchers to screen conditions in minutes on a standard laptop.

Despite the rapid progress and promise of such models, Chemma’s emergence has drawn scrutiny from the scientific community. Some chemists, including independent commentators Kevin Jablonka and Joshua Schrier, acknowledge Chemma’s efficiency but urge caution, stressing that these tools generate probabilistic outputs and lack genuine chemical intuition. There is concern that researchers may become overly reliant on models, risking a decline in critical thinking and expertise. Critics highlight the dangers of centralising scientific knowledge within automated systems and call for diversity, transparency, and continued human oversight in chemical research. The consensus is that language models like Chemma should be seen as powerful tools to assist, not replace, the nuanced judgement and responsibility of chemists.
