Self-Admitted GenAI Usage in Open-Source Software
Does GenAI adoption lead to increased code churn, as suggested by GitClear’s reports?
We could not confirm this using self-admitted GenAI usage as an analytical lens. Specifically, we searched open-source software projects for mentions of GenAI usage, manually labeled them, and used the timestamp of the first documented GenAI usage as a proxy for GenAI adoption in each project. Using a regression discontinuity design (RDD), we then assessed how code churn evolves before and after GenAI adoption.
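To give a rough idea of what such an analysis looks at, here is a minimal sketch of a discontinuity estimate in plain Python. It is not the model from the paper, whose exact specification is not described in this post. It simply fits one straight line to churn before the adoption date and one after it, then compares the two at the cutoff. The function names (`fit_line`, `rdd_jump`) and the synthetic data are assumptions made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rdd_jump(days_from_adoption, churn):
    """Compare linear churn trends on both sides of the adoption date.

    days_from_adoption: time relative to the first documented GenAI usage
    (negative = before, zero or positive = after). Hypothetical input format.
    Returns the jump in churn at the cutoff and the change in slope.
    """
    pairs = list(zip(days_from_adoption, churn))
    before = [(t, y) for t, y in pairs if t < 0]
    after = [(t, y) for t, y in pairs if t >= 0]
    intercept_pre, slope_pre = fit_line(*zip(*before))
    intercept_post, slope_post = fit_line(*zip(*after))
    return {
        "jump": intercept_post - intercept_pre,
        "trend_change": slope_post - slope_pre,
    }


# Synthetic, noise-free data (not from the study): one series with a smooth
# trend and one with an artificial step of 20 at the adoption date.
days = list(range(-30, 30))
smooth = [100 + 0.5 * t for t in days]
stepped = [y + (20 if t >= 0 else 0) for t, y in zip(days, smooth)]

smooth_result = rdd_jump(days, smooth)
stepped_result = rdd_jump(days, stepped)
```

In this framing, a "jump" estimate near zero means churn continues along its earlier trend after adoption. A real analysis would also need noise handling, significance tests, and a choice of time window, all of which are left out here.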
In addition, we examine the GenAI-related policies of the projects we analyzed, present a taxonomy of assisted tasks, generated content types, and usage purposes, and discuss the implications of our results for researchers and open-source software developers.
The paper preprint is available on arXiv.
Thanks to my co-authors Tao Xiao, Youmei Fan, Fabio Calefato, Christoph Treude, Raula Gaikovina Kula, and Hideaki Hata.