OpenAI shifts toward a fully automated Artificial Intelligence researcher

OpenAI is making a fully automated Artificial Intelligence researcher its central research goal, combining work on reasoning models, agents, and interpretability. The company expects an autonomous research intern by September and a broader multi-agent research system in 2028.

OpenAI is reorganizing its research agenda around what it calls an Artificial Intelligence researcher, a fully automated agent-based system designed to tackle large, complex problems with minimal human guidance. The effort is now the company’s stated “North Star” for the next few years and is intended to unify work on reasoning models, agents, and interpretability. OpenAI plans to build “an autonomous AI research intern” by September, a system that can take on a small number of specific research problems by itself. That system is meant to lead to a fully automated multi-agent research system that the company plans to debut in 2028.

The intended scope is broad. OpenAI envisions systems that could work on math and physics problems, contribute to biology and chemistry, and address business or policy questions, as long as the task can be expressed in text, code, or whiteboard sketches. Chief scientist Jakub Pachocki argues that recent progress suggests models are approaching the ability to work coherently for extended periods, with humans still setting goals and remaining in charge. He points to Codex as an early version of the broader concept, noting that OpenAI claims most of its technical staffers now use the tool in their work. Pachocki says the near-term target is a system that can handle delegated tasks that would take a person a few days.

That ambition builds on recent advances in coding agents and reasoning models. Pachocki points to the leap from 2020’s GPT-3 to 2023’s GPT-4 as evidence that greater general capability also improves how long models can work without help. He also says reasoning models, which step through problems and backtrack when needed, have extended how long systems can stay effective on difficult tasks. OpenAI is also training models on complex examples such as hard math and coding puzzles so they learn to manage large amounts of text and break work into multiple subtasks. Researchers have used GPT-5 to discover new solutions to a number of unsolved math problems and to push through dead ends in biology, chemistry, and physics puzzles, though Pachocki acknowledges the technology is not yet reliable enough to hand complete control to the system.

Outside researchers see promise but also significant obstacles. Doug Downey of the Allen Institute for AI says coding agents have made the idea of automated scientific work more plausible, but warns that multi-step research workflows remain fragile because errors compound when tasks must be chained together. He notes that OpenAI’s latest model, GPT-5, performed best in his group’s testing on scientific tasks but still made frequent errors, and that OpenAI released GPT-5.4 two weeks ago, which could already change those results.

Safety and governance remain unresolved. Pachocki says a system capable of running an entire research program raises serious questions about misuse, hacking, and misunderstanding instructions. OpenAI’s main current safeguard is chain-of-thought monitoring, in which reasoning models record intermediate notes that researchers can inspect to judge whether behavior is aligned with expectations. He says highly capable systems should operate in sandboxes and under restrictions for a long time because fully trustworthy control is still out of reach. Pachocki also warns that such systems could concentrate unprecedented power in a small number of hands, making government involvement essential even as the broader debate over military and other sensitive uses remains unsettled.

NVIDIA DLSS 5 uses 2D frames and motion vectors

NVIDIA has outlined DLSS 5 as a system that takes 2D frames and motion vectors as input, then uses a generative Artificial Intelligence model to produce its final output. The approach focuses on 2D imagery rather than full 3D scene generation to improve computational efficiency.

CSEM France pushes responsible Artificial Intelligence

CSEM France is positioning itself as a key force in France’s push for responsible Artificial Intelligence, combining technical research with ethics, policy engagement, and industry partnerships. Its work centers on trustworthy systems designed for transparency, fairness, and public accountability.

EU parliament backs ban on Artificial Intelligence nudifier apps

European parliament committees have endorsed changes to the Artificial Intelligence Act that would ban apps used to create non-consensual nude or sexually explicit images of real people. Lawmakers also backed delays and targeted adjustments to compliance rules for high-risk systems and watermarking requirements.

Chancellor sets principles for UK-EU alignment

Rachel Reeves has outlined a growth plan built around closer UK-EU ties, faster Artificial Intelligence adoption, and stronger regional development. The strategy sets new principles for regulatory alignment, expands support for innovation, and shifts more investment power to city regions.