How my life completely changed within four months
Hi, my name is Sebastian Baltes,
My research falls into what I call the triangle of empirical software engineering. From the beginning of my career, I have been interested in empirically studying phenomena observed in software engineering practice, to derive actionable insights and requirements for better tool support. Today, in my industry role, I regularly utilize empirical research methods to identify problems with existing software development processes and tools, with the goal of deriving guidelines and requirements. With that knowledge, I then build tool prototypes to address the identified issues. Further, most research projects I work on have an interdisciplinary angle, including my previous work with connections to psychology, law, economics, and management.
The first project I worked on during my PhD exemplifies this approach to research. While working as a software developer in industry, I noticed the widespread usage of informal diagrams without proper tool support. Triggered by this observation, I studied the problem empirically and then developed tool prototypes addressing the identified issues ( project). Interdisciplinary topics I worked on include legal implications of code plagiarism ( project) and knowledge transfer from psychology to software engineering ( project). While including an interdisciplinary perspective, the studied phenomena were always rooted in central software engineering problems and thoroughly studied empirically before proposing solutions in the form of tool prototypes or suggested process adaptations.
The guiding theme of my research has always been the idea that thoroughly analyzing and understanding the state of practice is an essential first step towards improving how software is being developed. Too often, we still see decisions in software projects being opinion-based rather than data-informed. My considerable industry experience helps me to balance the perspectives of (academic) researchers, software engineering practitioners, and other stakeholders. My research focuses on problems that I consider relevant for software developers in practice, and I also put effort into communicating results back to practitioners by giving talks in user groups and companies, as well as using other channels such as social media.
Besides research projects rooted in software development practice, I am also interested in the meta-scientific discourse within the software engineering research community. Early in my PhD research, I noticed potential ethical issues with common practices in software engineering research as well as a lack of knowledge about central aspects of empirical research in general (e.g., sampling). Therefore, I am part of ACM SIGSOFT's Empirical Standards initiative, which maintains a continuously evolving collection of guidelines for the various empirical methods used in software engineering research.
I consider myself a methods pluralist. To complement qualitative results derived from interviews, observational studies, or open-ended survey questions, I apply data-mining techniques to open source software projects or other data sets, including data shared by companies under non-disclosure agreements. From 2018 until 2021, I also maintained the open dataset SOTorrent, which other researchers used to study the origin, evolution, and usage of Stack Overflow content ( project). This dataset was selected as the official mining challenge of MSR 2019. I am also interested in information visualization and visual analytics, exploring how interactive visualizations can support humans in analyzing data. I regularly develop custom visualizations that we have used in different research projects, including within SAP, to explore data or to derive patterns. I follow open science and open data practices by publishing data, software, analysis scripts, and paper preprints whenever possible.
Consider submitting to the Empirical Software Engineering journal, where I'm a member of the editorial board.
A Simple Web Scraper for Journals and Conference Proceedings
A Study of Airbnb Listings in Two Berlin Neighborhoods
A Conceptual Theory
Retrieval using the SOTorrent Dataset