Study Shows AI Coding Assistant Improves Developer Productivity

Researchers from Microsoft, MIT, Princeton University, and the Wharton School of the University of Pennsylvania recently published a study showing that the use of GitHub Copilot increased developer productivity. The team conducted three separate randomized controlled trials (RCTs) involving over 4,000 developers; those using Copilot achieved a 26% increase in productivity.

The three experiments were performed at Microsoft, Accenture, and an "anonymous Fortune 100 electronics manufacturing company." For each of the 4,867 developers in the study, the researchers measured the weekly number of pull requests, commits, and code builds performed. They found that developers using Copilot had an average increase of 26.08% in the number of pull requests completed per week. They also found that productivity varied by developer experience, with less experienced developers getting more benefit from Copilot. According to the research team:

Our work complements both the literature on lab experiments as well as these observational studies by studying the impact of generative AI using a field experiment in an actual workplace setting. To date, there is still a dearth of experimental studies examining the effect of generative AI in a field setting.

The experiments were conducted in 2022 and 2023, using a version of Copilot based on GPT-3.5. At Microsoft and Accenture, developers in the experiment were randomly selected to use Copilot, while in the anonymous company, all devs were granted access eventually, but with randomly selected start dates. In addition to tracking the developer productivity measures, the researchers tracked Copilot adoption and usage.
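To make the design concrete, here is a minimal Python sketch of the two ideas described above: randomly assigning developers to a treatment group, and expressing an outcome such as weekly pull requests as a relative increase over the control group. It is not the researchers' method; the paper's actual estimation is more involved. The function names `assign_groups` and `relative_increase`, and all of the numbers, are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def assign_groups(developer_ids, seed=0):
    """Randomly split developers into a treatment and a control group.

    Hypothetical helper: the treatment group would get access to the tool.
    """
    rng = random.Random(seed)  # seeded so the split is reproducible
    shuffled = list(developer_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])


def relative_increase(treated_values, control_values):
    """Percent difference between the treated mean and the control mean."""
    control_mean = mean(control_values)
    return 100.0 * (mean(treated_values) - control_mean) / control_mean


# Eight made-up developer IDs, split at random into two groups of four.
treatment, control = assign_groups(range(8), seed=42)

# Made-up weekly pull-request counts (not data from the study).
toy_treated_prs = [5, 6, 4, 5]
toy_control_prs = [4, 4, 4, 4]
toy_effect = relative_increase(toy_treated_prs, toy_control_prs)
```

Because assignment is random, the two groups should be comparable on average, which is what allows a difference in outcomes to be attributed to the tool rather than to who chose to use it.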

The research team analyzed their results across all devs as well as by developer tenure and skill level. They found that short-tenured and junior devs were more likely to adopt Copilot and to continue using it for more than one month, and that these devs were more likely to accept the code generated by Copilot. These devs also saw the largest productivity gains from the tool.

Wharton professor Ethan Mollick shared the results in a thread on X, writing:

We now have randomized controlled trials showing large performance gains in real companies for coding, management, entrepreneurship, and writing using AI.

In a discussion about the study on Hacker News, several users shared that the paper's results matched their own experience with Copilot. One user wrote:

The most interesting thing about this study for me is that when they break it down by experience levels, developers who are above the median tenure show no statistically significant increase in productivity...Copilot is nice for resolving some tedium and freeing up my brain to focus more on deeper questions, but it's not as world-altering as junior devs describe it as. It's also frequently subtly wrong in ways that a newer dev wouldn't catch, which requires me to stop and tweak most things it generates in a way that a less experienced dev probably wouldn't know to.

The effect of generative AI on employee productivity, and specifically software developer productivity, is an open research area. Earlier this year, InfoQ covered a survey by Upwork Research Institute in which a majority of employees surveyed reported that GenAI decreased their productivity. InfoQ also covered a study by eBay in which GitHub Copilot did increase developer productivity.
