Stack Overflow Snippets in GitHub Projects


Blog Post

Abstract

Stack Overflow (SO) is the largest Q&A website for developers, providing a large number of copyable code snippets. Using these snippets raises various maintenance and legal issues. The SO license requires attribution, i.e., referencing the original question or answer, and requires derived work to adopt a compatible license. While there is a heated debate on SO’s license model for code snippets and the required attribution, little is known about the extent to which snippets are copied from SO without proper attribution. We conducted a large-scale empirical study to analyze the usage and attribution of non-trivial code snippets from SO answers in public GitHub projects. Below, you’ll find supplementary material for the extended abstract published in the ICSE 2017 companion volume and for the corresponding EMSE journal paper. For information about the research design and results, see the posters and preprints listed below the supplementary material.

Supplementary Material (EMSE Journal Paper)

Usage and Attribution of Stack Overflow Code Snippets in GitHub Projects — Supplementary Material.
Sebastian Baltes.
http://doi.org/10.5281/zenodo.1148069

The dataset is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.

Supplementary Material (ICSE Extended Abstract)

License

The scripts and data we created as well as the data from the surveys are licensed under CC BY 4.0.

For data retrieved from the BigQuery GitHub data set, see the GitHub Terms of Service. All content retrieved from Stack Overflow, including content from the BigQuery Stack Overflow data set, is licensed under CC BY-SA 3.0 (see also the Stack Exchange Network Terms of Service). GHTorrent is distributed under a dual licensing scheme (see GHTorrent FAQ and CCPlus).

Publication

Usage and Attribution of Stack Overflow Code Snippets in GitHub Projects.
Sebastian Baltes and Stephan Diehl.
Empirical Software Engineering (to appear).
Preprint arXiv

Talks and Posters

Attribution Required: Stack Overflow Code Snippets in GitHub Projects (Poster & Extended Abstract).
Sebastian Baltes, Richard Kiefer, and Stephan Diehl.
Proceedings of the 39th International Conference on Software Engineering Companion (ICSE 2017).
Preprint arXiv Poster BibTeX

Attribution Required: Stack Overflow Code Snippets in GitHub Projects (Poster).
Sebastian Baltes, Richard Kiefer, and Stephan Diehl.
39th International Conference on Software Engineering (ICSE 2017).
Poster

The documents distributed on this website have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author’s copyright and the provided license. Works that are not CC-licensed may not be reposted without the explicit permission of the copyright holder.