Artificial intelligence coding tools reduced productivity for experienced engineers, study finds

A METR study found that some experienced developers became less productive when using AI coding assistants, contradicting common industry assumptions.

AI code editors such as Cursor and other generative software development tools have been adopted rapidly by major tech firms including Amazon, Microsoft, and Google. But a recent study from Model Evaluation & Threat Research (METR) suggests their effects are not uniformly positive, especially for experienced software developers. The research, conducted with 16 seasoned programmers who managed large open-source repositories, found that those equipped with AI coding tools completed tasks 19% slower, on average, than their non-AI-assisted peers.

The study's methodology randomly assigned developers to two groups: one allowed to use AI coding assistance (with a range of tools, but mostly Cursor with Claude 3.5/3.7 Sonnet), and one prohibited from doing so. A notable outcome was a disconnect between perception and reality: AI-assisted developers believed, even after completing the tasks, that their productivity had improved by an estimated 20%. The actual data showed a marked slowdown. Developers working without AI spent over 10% more time actively writing code, whereas their AI-using counterparts spent over 20% more time reviewing, prompting, waiting on outputs, or idling, a significant shift in activity patterns. AI-generated code was accepted less than half the time, and developers reported spending about 9% of their time cleaning up machine outputs.

Reactions to the findings highlight their nuance. METR researcher Nate Rush said he was surprised by the negative result, and cautioned that the outcomes reflect the study's specific context: all participants were highly experienced, and AI code assistants might perform better for less seasoned programmers. Steve Newman, Google Docs cofounder, initially found the results too negative to be true, but came to consider the research credible after reviewing its methods. Critics, including a developer who participated, point out that coding assistants have evolved quickly since the study period in February 2025. METR itself stresses that the data is "point-in-time," and encourages developers to use AI more judiciously, based on an honest assessment of its actual productivity impact. The broader lesson: productivity gains from AI tools likely vary from person to person, and overconfidence in their capabilities can lead to unexpected slowdowns, even for the most skilled engineers.


