MIT researchers present SEAL, advancing self-improving language models

MIT unveils SEAL, a novel framework that lets large language models self-edit and adapt using reinforcement learning—pushing the frontier of self-improving Artificial Intelligence.

MIT researchers have introduced SEAL (Self-Adapting LLMs), a groundbreaking approach that empowers large language models to autonomously update their own parameters. The new framework, detailed in the paper "Self-Adapting Language Models", centers on the concept of self-generated data: the model creates and applies its own training samples, or self-edits, through carefully designed reinforcement learning loops. By tying performance rewards to downstream tasks, the model learns which self-edits are most beneficial for continuous improvement.
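The core idea can be sketched in a few lines. This is a toy illustration, not the paper's code: every function here (`generate_self_edit`, `finetune`, `downstream_score`) is a stand-in we invented to show how the reward is tied to downstream performance after applying a self-edit.

```python
# Toy sketch of SEAL's reward loop. All names and values are illustrative
# stand-ins, not the actual implementation from the paper.

def generate_self_edit(model, context):
    # Stand-in: a real system would sample synthetic training data
    # (the "self-edit") from the LLM itself, given new context.
    return {"samples": [context + " (restated)"], "lr": 1e-4}

def finetune(model, self_edit):
    # Stand-in for supervised fine-tuning on the self-edit's samples;
    # here the "model" is just a number nudged by the edit.
    return model + len(self_edit["samples"])

def downstream_score(model, task):
    # Stand-in evaluation: the RL reward comes from downstream
    # task performance of the updated model.
    return 1.0 if model > task["threshold"] else 0.0

def seal_step(model, context, task):
    edit = generate_self_edit(model, context)
    updated = finetune(model, edit)
    reward = downstream_score(updated, task)
    return updated, reward

model, reward = seal_step(0, "New fact about X", {"threshold": 0})
```

In the real framework, the reward signal feeds back into how the model generates future self-edits, which is what distinguishes SEAL from one-off synthetic-data fine-tuning.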

The SEAL method operates through a nested structure. The outer loop uses reinforcement learning to guide the generation of effective self-edits, while the inner loop updates the model using supervised fine-tuning based on these edits. Initially, researchers observed instability with standard policy optimization methods, ultimately favoring a more robust behavioral cloning strategy (ReST^EM) inspired by work at DeepMind. This process filters self-edits based on observed performance gains before incorporating them. While the current design uses a single model for generating and learning from edits, future iterations could separate these into distinct "teacher" and "student" models.
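The filter-then-clone outer loop can be sketched as follows. Again, this is a minimal toy under our own assumptions (a "model" reduced to a scalar quality, self-edits reduced to random strengths); it only shows the ReST^EM-style pattern of sampling candidates, keeping those that improve downstream performance over the baseline, and training on the survivors.

```python
# Toy sketch of a ReST^EM-style round: sample candidate self-edits,
# keep only those whose fine-tuned model beats the current baseline,
# then clone (train on) the surviving edits. All stand-ins are ours.
import random

def evaluate(model_quality):
    # Stand-in downstream evaluation; higher is better.
    return model_quality

def apply_edit(model_quality, edit_strength):
    # Stand-in inner-loop fine-tuning: the edit shifts model quality.
    return model_quality + edit_strength

def restem_round(model_quality, num_candidates=8, seed=0):
    rng = random.Random(seed)
    baseline = evaluate(model_quality)
    # Sample candidate self-edits (here, just random strengths).
    candidates = [rng.uniform(-1.0, 1.0) for _ in range(num_candidates)]
    # Filter: keep edits whose observed gain is positive.
    kept = [e for e in candidates
            if evaluate(apply_edit(model_quality, e)) > baseline]
    # Behavioral cloning target: train on the kept edits
    # (here, simply apply their average).
    if kept:
        model_quality = apply_edit(model_quality, sum(kept) / len(kept))
    return model_quality, len(kept)

quality, n_kept = restem_round(0.0)
```

The filtering step is why this is more stable than raw policy optimization: only edits with verified gains ever become training signal.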

SEAL was put to the test in domains such as knowledge integration and few-shot learning. Results were notable: in few-shot learning with a Llama-3.2-1B-Instruct model, SEAL improved adaptation success rates dramatically, reaching over 70 percent success compared to more conventional approaches. For knowledge integration, the Qwen2.5-7B model effectively assimilated new facts, outpacing baseline and previous reinforcement learning methods, sometimes exceeding even setups using GPT-4.1-generated data. The researchers highlighted how reinforcement learning not only boosted quantitative outcomes but also enabled the model to generate more nuanced, task-relevant self-edits. Despite the promise, challenges remain—particularly with catastrophic forgetting, computational costs, and context-aware evaluation, all of which the team discusses in their publication.

This work emerges amid a surge of global interest in self-evolving Artificial Intelligence, with parallel projects like Sakana AI's Darwin Gödel Machine and OpenAI's speculation on recursive self-improvement capturing widespread attention. SEAL stands out as a concrete and experimentally validated step towards autonomous, self-improving language technologies, offering a glimpse at the ongoing transformation of the field.


LLM-PIEval: a benchmark for indirect prompt injection attacks in large language models

The integration of large language models with external tools introduces risks such as direct and indirect prompt injection. LLM-PIEval provides a framework and test set for measuring indirect prompt injection risk, and the authors release API specifications and prompts to support wider assessment.

NVIDIA may stop bundling memory with GPU kits amid GDDR shortage

NVIDIA is reportedly considering supplying only bare silicon to its AIC partners rather than the usual GPU and memory kit as GDDR shortages constrain fulfillment. The move follows wider industry pressure from soaring DRAM prices and an impending price increase from AMD of about 10% across its GPU lineup.

SK Hynix to showcase 48 Gb/s 24 Gb GDDR7 for Artificial Intelligence inference

SK Hynix will present a 24 Gb GDDR7 chip rated for 48 Gb/s at ISSCC 2026, claiming a symmetric dual-channel design and updated internal interfaces that push past the expected 32 to 37 Gb/s. The paper positions the device for mid-range Artificial Intelligence inference and SK Hynix will also show LPDDR6 running at 14.4 Gb/s.
