How my life completely changed within four months
Hi, my name is Sebastian Baltes,
My research falls into what I call the triangle of empirical software engineering. My first research projects focused on a specific observation I made while working as a software developer in industry: the widespread use of informal diagrams without proper tool support. I empirically studied the problem and developed tool prototypes to address the identified issues. I then moved towards interdisciplinary topics related to the legal implications of code plagiarism and knowledge transfer from psychology. Even with this interdisciplinary perspective, the phenomena I studied were always rooted in central software engineering problems, and I studied them thoroughly and empirically before proposing solutions in the form of tool prototypes or process adaptations.
The guiding theme of my research has always been the idea that thoroughly analyzing and understanding the state of practice is an essential first step towards improving how software is developed. Too often, we still see decisions in software projects based on opinions rather than informed by empirical data. My industry experience (in the past as a student, and more recently after moving back from Australia to Germany) helps me to consider the perspectives of both researchers and practitioners. My research focuses on problems that I consider relevant for software developers in practice, and I also put effort into communicating results back to practitioners by giving talks at user groups and companies and through other channels such as social media.
I consider myself a methods pluralist. To complement qualitative results derived from interviews, observational studies, or open-ended survey questions, I apply data-mining techniques to open source software projects or other data sets, including data shared by companies under non-disclosure agreements. I also maintain the open dataset SOTorrent, which other researchers use to study the origin, evolution, and usage of Stack Overflow content. This dataset was selected as the official mining challenge of MSR 2019. I am also interested in information visualization and visual analytics, exploring how interactive visualizations can support humans in analyzing data. I regularly develop custom visualizations that we have used in different research projects to explore data or to derive patterns. I follow open science and open data best practices, meaning that I try to publish data, software, analysis scripts, and paper preprints whenever possible.
Consider submitting to the Empirical Software Engineering journal, where I'm a member of the editorial board.
A Simple Web Scraper for Journals and Conference Proceedings
A Study of Airbnb Listings in Two Berlin Neighborhoods
A Conceptual Theory
Retrieval using the SOTorrent Dataset