Research
This page provides an overview of my previous and ongoing research projects, organized into seven main research areas. Many projects we work on focus on software analytics, that is, processing, analyzing, and visualizing software engineering data to monitor, govern, and improve software development processes and tools. We are further interested in interdisciplinary research and methodological aspects of empirical software engineering. We are convinced that thoroughly analyzing and understanding the state-of-practice is an essential first step towards improving how software is being developed. Our vision is that software engineering becomes more evidence-based, which is only possible if academic researchers provide actionable insights on topics that are relevant to practitioners. You can find more information on my personal notion of empirical software engineering in my research statement and in my inaugural lecture at the University of Bayreuth, which is available online.
- Research overview (video of inaugural lecture)
- Data-driven decision making in software engineering
- Human-centric software engineering
- Mining software repositories
- Meta-scientific issues in software engineering research
- Developer experience and tool support
- Behavioral studies of software developers
- Interdisciplinary research
Research overview (video of inaugural lecture)
Data-driven decision making in software engineering
Projects in this research area include empirical studies and tool prototypes that support data-driven decision making in software projects. In our paper Taming Timeout Flakiness: An Empirical Study of SAP HANA, for example, we empirically analyzed the impact of test case timeouts on test flakiness, and we proposed and evaluated a cost-optimal alternative to the commonly used baseline of developer-defined timeout values. Our paper Visually Analyzing Company-wide Software Service Dependencies: An Industrial Case Study reports how we developed a tailored visualization for company-wide software service dependencies that was used to guide service deprecation and retirement. With our global Pandemic Programming study, we were among the first researchers to provide data-driven recommendations for organizations to better understand and support their developers in a forced work-from-home setup.
- Do Test and Environmental Complexity Increase Flakiness? An Empirical Study of SAP HANA
- Taming Timeout Flakiness: An Empirical Study of SAP HANA
- Visually Analyzing Company-wide Software Service Dependencies: An Industrial Case Study
- Pandemic Programming: How COVID-19 affects software developers and how their organizations can help
Human-centric software engineering
Software is developed with and for a wide range of stakeholders. Better understanding how human behavior impacts software development and vice versa is another research area we work on. In our paper UX Debt: Developers Borrow While Users Pay, for example, we investigated the relationship between traditional code-centric notions of technical debt and the broader concept of user experience debt. Another example is our paper “STILL AROUND”: Experiences and Survival Strategies of Veteran Women Software Developers, where we studied the intersection of ageism and sexism in the software industry by interviewing veteran software developers belonging to marginalized genders.
- Making Software Development More Diverse and Inclusive: Key Themes, Challenges, and Future Directions
- Bridging Gaps, Building Futures: Advancing Software Developer Diversity and Inclusion Through Future-Oriented Research
- UX Debt: Developers Borrow While Users Pay
- “STILL AROUND”: Experiences and Survival Strategies of Veteran Women Software Developers
- Challenges for Inclusion in Software Engineering: The Case of the Emerging Papua New Guinean Society
- Is 40 the new 60? How popular media portrays the employability of older software developers
- Towards a Theory of Software Development Expertise
Mining software repositories
Projects in this research area typically analyze data from (large) software repositories, including version control systems, issue tracking systems, or developer Q&A forums, with the goal of identifying patterns or deriving recommendations. In the past, we studied links in commit messages, how developers use novel features such as GitHub Discussions, how they search for content on Stack Overflow, and whether and how they attribute code snippets from Stack Overflow when they reuse them in GitHub projects.
- Do Test and Environmental Complexity Increase Flakiness? An Empirical Study of SAP HANA
- Taming Timeout Flakiness: An Empirical Study of SAP HANA
- 18 Million Links in Commit Messages: Purpose, Evolution, and Decay
- GitHub Discussions: An Exploratory Study of Early Adoption
- Characterizing Search Activities on Stack Overflow
- On the Diversity and Frequency of Code Related to Mathematical Formulas in Real-World Java Projects
- Code Duplication on Stack Overflow
- An Annotated Dataset of Stack Overflow Post Edits
- Usage and Attribution of Stack Overflow Code Snippets in GitHub Projects
- SOTorrent: Studying the Origin, Evolution, and Usage of Stack Overflow Code Snippets (see also the SOTorrent Project Page)
- SOTorrent: Reconstructing and Analyzing the Evolution of Stack Overflow Posts
- (No) Influence of Continuous Integration on the Commit Activity in GitHub Projects
- Attribution Required: Stack Overflow Code Snippets in GitHub Projects
Meta-scientific issues in software engineering research
As empirical software engineering matures as a discipline, meta-scientific discussions on how to best conduct empirical research in software engineering become increasingly important. An issue we focused on early is the effectiveness and ethical implications of different strategies for sampling software developers. We later continued that line of work with a critical review of sampling in software engineering research. Another topic we have published on is secondary research in software engineering, which is still quite rudimentary compared to other disciplines. Recently, we investigated the challenges that software engineering researchers face when trying to publish interdisciplinary research. Moreover, we are working on evaluation guidelines for empirical studies involving LLMs (see also the project website).
- Not Real or Too Soft? On the Challenges of Publishing Interdisciplinary Software Engineering Research
- Towards Evaluation Guidelines for Empirical Studies involving LLMs
- Teaching Literature Reviewing for Software Engineering Research
- Paving the Way for Mature Secondary Research: The Seven Types of Literature Review
- Sampling in Software Engineering Research: A Critical Review and Guidelines
- Worse Than Spam: Issues In Sampling Software Developers
Developer experience and tool support
Developer experience refers to how the interplay of people (developers and other stakeholders) and the environment (processes, tools, culture, technology) positively or negatively affects different parts of the software development lifecycle. This broad area touches “micro” aspects such as developers’ local tooling but also “macro” aspects such as documentation ecosystems that developers rely on in their daily work. Related aspects are developer productivity, wellbeing, and satisfaction, all of which are difficult to accurately describe and operationalize. This category also shows that the research areas outlined on this page overlap, as our Pandemic Programming study also captured developers’ self-assessed productivity and wellbeing as central aspects. Other papers in this area presented novel tool prototypes, for example, to better integrate cost transparency in the development of cloud applications or to support developers in linking documentation and source code.
- A Penny a Function: Towards Cost Transparent Cloud Programming
- Automated Query Reformulation for Efficient Search Based on Query Logs from Stack Overflow
- Pandemic Programming: How COVID-19 affects software developers and how their organizations can help
- Round-Trip Sketches: Supporting the Lifecycle of Software Development Sketches from Analog to Digital and Back
- Linking Sketches and Diagrams to Source Code Artifacts
- RegViz: Visual Debugging of Regular Expressions
Behavioral studies of software developers
Traces of software developers’ behavior can be found in existing repositories and artifacts (see mining software repositories), but this behavior can also be studied using other empirical research methods such as lab and field studies, online surveys, and interviews. Our previous work in this area includes studies on how developers debug performance bugs in a pair programming setting, how they use sketches and diagrams in their daily work, and how they link documentation resources.
- Contextual Documentation Referencing on Stack Overflow
- Usage and Attribution of Stack Overflow Code Snippets in GitHub Projects
- Navigate, Understand, Communicate: How Developers Locate Performance Bugs
- Sketches and Diagrams in Practice
Interdisciplinary research
Interdisciplinary research in software engineering can mean applying concepts from other disciplines such as psychology or mathematics to software engineering problems. However, it can also mean working with researchers from other disciplines on problems rooted in software engineering or in their domain. It can also mean conducting a study focusing on software engineering, the results of which are then picked up in other disciplines as well.
- Applying Information Theory to Software Evolution
- From Full-fledged ERP Systems Towards Process-centric Business Process Platforms
- Pandemic Programming: How COVID-19 affects software developers and how their organizations can help
- Towards a Theory of Software Development Expertise
- Constructing Urban Tourism Space Digitally: A Study of Airbnb Listings in Two Berlin Neighborhoods
- Visual Analysis and Coding of Data-Rich User Behavior
- VisualCues: Visually Explaining Source Code in Computer Science Education
- CodeBasket: Making Developers’ Mental Model Visible and Explorable