Artificial intelligence coding tools reduced productivity for experienced engineers, study finds

A METR study found that experienced developers became less productive, on average, when using AI coding assistants, contradicting common industry assumptions.

Artificial intelligence (AI) code editors such as Cursor and other generative software development tools have been rapidly adopted by major tech firms including Amazon, Microsoft, and Google. But a recent study from Model Evaluation & Threat Research (METR) suggests their effects are not uniformly positive, especially for experienced software developers. The research, conducted with 16 seasoned programmers who maintain large open-source repositories, found that they completed tasks 19% slower, on average, when using AI coding tools than when working without them.

The study's methodology randomly assigned each task to one of two conditions: AI coding assistance allowed (a range of tools, but mostly Cursor with Claude 3.5/3.7 Sonnet) or prohibited. A notable outcome was a disconnect between perception and reality: even after completing their tasks, AI-assisted developers believed the tools had sped them up by an estimated 20%. The actual data showed a marked slowdown. Developers working without AI spent over 10% more time actively writing code, whereas their AI-using counterparts spent over 20% more time reviewing outputs, prompting, waiting on generations, or idling, a significant shift in activity patterns. Notably, AI-generated code was accepted less than half the time, and developers reported spending about 9% of their time cleaning up machine outputs.

Reactions to the findings highlight their nuance. METR researcher Nate Rush admitted he was surprised by the negative result, cautioning that it reflects the study's specific context: all participants were highly experienced, and AI code assistants might serve less seasoned programmers better. Steve Newman, cofounder of Google Docs, initially found the results too negative to be true but came to consider the research credible after reviewing its methods. Critics, including one developer who participated, point out that coding assistants have evolved quickly since the study period in February 2025. METR itself stresses that the data is 'point-in-time' and encourages developers to use AI more judiciously, grounded in self-awareness of its actual productivity impact. The broader lesson: productivity gains from AI tools are likely individualized, and overconfidence in their capabilities can lead to unexpected slowdowns, even for the most skilled engineers.
