Training artificial intelligence models to assimilate new knowledge

MIT researchers developed a method that lets large language models generate study-like synthetic data and update their weights to permanently internalize new information, a step toward self-improving artificial intelligence.

Large language models deployed today cannot permanently learn from new information the way humans do. Once a model’s training is complete, its internal weights remain static, so information provided during a conversation does not persist across sessions. Models do perform well at in-context learning, using examples within a single interaction to guide their responses, but that knowledge disappears when the session ends.

Researchers at MIT introduced a framework called SEAL, for “self-adapting LLMs,” that teaches a model to update its own weights using synthetic training data it generates itself. The model rewrites incoming information into multiple candidate “self-edits,” akin to a student creating study sheets, then evaluates each candidate by quizzing itself on downstream tasks. A reinforcement learning loop rewards the self-edits that produce the largest performance gains, and the best edit is applied to the model’s weights so the new knowledge is permanently internalized.
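To make that loop concrete, here is a minimal sketch of one SEAL-style iteration in Python. Everything in it is illustrative rather than the researchers’ actual interface: the class and method names (ToyModel, rewrite, finetune_on, quiz_accuracy) are assumptions, a real self-edit is synthetic training text generated by the model, and a real “finetune” is a gradient update to the weights rather than a set union.

```python
import random

class ToyModel:
    """Toy stand-in for an LLM whose 'weights' are a set of known facts."""

    def __init__(self, knowledge=None):
        self.knowledge = set(knowledge or [])

    def rewrite(self, passage, seed):
        # Stand-in for generating one candidate self-edit: a
        # study-sheet-like restatement of part of the passage.
        rng = random.Random(seed)
        facts = passage.split(". ")
        return rng.sample(facts, k=max(1, len(facts) - 1))

    def finetune_on(self, edit):
        # Stand-in for a weight update: returns a model that now
        # encodes the self-edit's content.
        return ToyModel(self.knowledge | set(edit))

    def quiz_accuracy(self, quiz):
        # Self-quiz: fraction of downstream questions answerable
        # after the update.
        return sum(q in self.knowledge for q in quiz) / len(quiz)

def seal_step(model, passage, quiz, n_edits=4):
    """One SEAL-style iteration: sample candidate self-edits, score each
    by quizzing a temporarily updated model, then permanently apply the
    highest-reward edit (the reinforcement learning signal)."""
    candidates = [model.rewrite(passage, seed=i) for i in range(n_edits)]
    best = max(candidates,
               key=lambda e: model.finetune_on(e).quiz_accuracy(quiz))
    return model.finetune_on(best)

passage = ("SEAL rewrites new text into self-edits. "
           "Self-edits act like study sheets. "
           "The best edit is applied to the weights")
quiz = ["Self-edits act like study sheets",
        "The best edit is applied to the weights"]
model = seal_step(ToyModel(), passage, quiz)
print(model.quiz_accuracy(quiz))  # the applied edit persists in the model
```

The design point the description implies is that the reward signal comes from downstream quiz performance after a trial update, so over many iterations the model learns which kinds of study sheets actually help it retain information.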

In experiments the SEAL method improved accuracy on question-answering tasks by nearly 15 percent and boosted success rates on some skill-learning tasks by more than 50 percent, with a small model outperforming much larger models on certain benchmarks. The authors note a key limitation: catastrophic forgetting, where adapting to new information can erode prior knowledge. The team plans further work to mitigate forgetting and to explore multi-agent settings in which models teach each other. The research was led by MIT students and faculty and will be presented at the Conference on Neural Information Processing Systems, with support from several funding agencies including the U.S. Army Research Office and the U.S. Air Force AI Accelerator.

Impact Score: 68

Artificial Intelligence speeds quantum encryption threat timeline

Research from Google and Oratomic suggests quantum computers capable of breaking core internet encryption may arrive sooner than expected. Artificial Intelligence played a key role in improving one of the new algorithms, raising fresh urgency around post-quantum security.

New methods aim to improve Large Language Model reasoning

A new study on arXiv outlines algorithmic techniques designed to strengthen Large Language Model reasoning and reduce hallucinations. The work reports better logical consistency and stronger performance on mathematical and coding benchmarks.

Nvidia acquisition of SchedMD raises Slurm neutrality concerns

Nvidia’s purchase of SchedMD has given it control of Slurm, an open-source scheduler that sits at the center of many supercomputing and large-model training systems. Researchers and engineers are watching for signs that support could tilt toward Nvidia hardware over AMD and Intel alternatives.

Mustafa Suleyman says Artificial Intelligence compute growth is still accelerating

Mustafa Suleyman argues that Artificial Intelligence development is being propelled by simultaneous advances in chips, memory, networking, and software efficiency rather than nearing a hard limit. He contends that rising compute capacity and falling deployment costs will push systems beyond chatbots toward more capable agents.

China and the US are leading different Artificial Intelligence races

The US leads in large language models and advanced chips, while China has built a major advantage in robotics and humanoid manufacturing. That balance is shifting as Chinese developers narrow the gap in model performance and both countries push to combine software and machines.
