Researchers at MIT have introduced FlowER (Flow matching for Electron Redistribution), a generative artificial intelligence system designed to predict chemical reaction mechanisms while explicitly enforcing physical constraints such as conservation of mass and electrons. The work, reported Aug. 20 in Nature and described in an MIT news release on Sept. 3, 2025, adapts a bond-electron matrix representation originally developed by chemist Ivar Ugi to keep track of bonds and lone electron pairs. The matrix uses nonzero values to represent bonds or lone pairs and zeros to represent their absence, which lets the model conserve both atoms and electrons across each predicted mechanistic step.
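To make the bookkeeping idea concrete, here is a minimal sketch of a bond-electron matrix for one toy step, hydroxide plus a proton giving water. It assumes the textbook Ugi-style convention (off-diagonal entries are bond orders, diagonal entries count free valence electrons on each atom); the function names and the example are illustrative and are not taken from the FlowER code, whose actual encoding may differ.

```python
# Illustrative sketch only: a toy bond-electron (BE) matrix check.
# Assumed convention: be[i][j] (i != j) is the bond order between atoms i and j,
# and be[i][i] is the number of free (non-bonding) valence electrons on atom i.

# Atom order is fixed for both sides of the step: O, H_a, H_b.
ATOMS = ["O", "H_a", "H_b"]

# Hydroxide (O-H_a bond, three lone pairs on O) plus a bare proton H_b.
reactants = [
    [6, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

# Water: O bonded to both hydrogens, two lone pairs left on O.
products = [
    [4, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]


def total_electrons(be):
    """Sum of all entries; each bond of order n appears twice, i.e. 2n electrons."""
    return sum(sum(row) for row in be)


def is_symmetric(be):
    """A valid BE matrix lists every bond identically from both of its atoms."""
    n = len(be)
    return all(be[i][j] == be[j][i] for i in range(n) for j in range(n))


def reaction_matrix(before, after):
    """Entry-wise difference (after - before): the electron redistribution."""
    return [[a - b for b, a in zip(row_b, row_a)]
            for row_b, row_a in zip(before, after)]


def conserves_electrons(before, after):
    """Same atoms (same size), well-formed matrices, same electron total."""
    return (
        len(before) == len(after)
        and is_symmetric(before)
        and is_symmetric(after)
        and total_electrons(before) == total_electrons(after)
    )


delta = reaction_matrix(reactants, products)
print(delta)
print(conserves_electrons(reactants, products))
```

Because both matrices are indexed by the same atom list, atoms cannot appear or vanish between the two sides, and the entries of the difference matrix sum to zero whenever electrons are conserved. In this toy step, one lone pair on oxygen becomes the new O-H bond.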
The team, which includes Joonyoung Joung, Mun Hong Fong, Nicholas Casetti, Jordan Liles, Ne Dassanayake, and senior author Connor Coley, trained the model on data drawn from more than a million chemical reactions in a U.S. Patent Office database. FlowER is intended to address a limitation of large language models, which can treat atoms as tokens and, unless constrained, spuriously add or delete atoms in a predicted reaction. By anchoring reactants and products to experimentally validated patent literature and inferring the underlying mechanistic steps, the researchers report large gains in the validity of predictions and in conservation of mass and electrons, with mechanism-prediction accuracy that matches or slightly exceeds that of existing systems.
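The atom-conservation failure described above can be illustrated with a simple balance check on a predicted reaction. The sketch below is a hypothetical post hoc filter, not part of FlowER, which builds conservation into its representation instead of checking afterward. It assumes plain molecular formulas with no parentheses or charges.

```python
# Illustrative sketch only: detect added or deleted atoms in a predicted reaction.
# Assumes simple formulas such as "C2H6O" (no parentheses, charges, or isotopes).
import re
from collections import Counter

_ELEMENT_COUNT = re.compile(r"([A-Z][a-z]?)(\d*)")


def atom_counts(formula):
    """Count atoms of each element in one molecular formula."""
    counts = Counter()
    for element, number in _ELEMENT_COUNT.findall(formula):
        counts[element] += int(number) if number else 1
    return counts


def side_counts(formulas):
    """Total atom counts over all species on one side of a reaction."""
    total = Counter()
    for formula in formulas:
        total.update(atom_counts(formula))
    return total


def is_atom_balanced(reactants, products):
    """True when every element appears equally often on both sides."""
    return side_counts(reactants) == side_counts(products)


# Esterification: acetic acid + ethanol -> ethyl acetate + water.
print(is_atom_balanced(["C2H4O2", "C2H6O"], ["C4H8O2", "H2O"]))
# The same prediction with the water by-product dropped loses two H and one O.
print(is_atom_balanced(["C2H4O2", "C2H6O"], ["C4H8O2"]))
```

A check like this only flags violations after the fact and says nothing about electrons, which is one reason a representation that conserves both by construction is attractive.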
The authors note current limitations: certain metals and many kinds of catalytic cycles are underrepresented in the training data, so FlowER’s coverage of those chemistries is limited. The team plans to expand the model’s understanding of metals and catalytic cycles in future work. The system and datasets, including an exhaustive mechanistic dataset developed by Joung, are being released as open source on GitHub. The researchers say FlowER could be useful for medicinal chemistry, materials discovery, combustion, atmospheric chemistry, and electrochemical systems, and they view this work as a proof of concept toward broader mechanistic discovery. Funding came from the Machine Learning for Pharmaceutical Discovery and Synthesis consortium and the National Science Foundation.