What Don’t We Know About Agentic AI?


Original call on LinkedIn
Follow-up on LinkedIn

On 9 and 10 June 2026, the SE@UHD group met in Heidelberg for a small research retreat. Our doctoral and postdoctoral researchers presented their current work as posters, and we used the rest of the time to look ahead: Which questions should we work on next, especially around software development with agentic AI tools?

Before the retreat, I asked my LinkedIn network one question: What don’t we know about software development with agentic AI tools, but should? And by “know,” I meant proper empirical evidence, not anecdotes, demos, or blog posts. Thanks again to everyone who shared ideas, papers, and concerns. We brought that input into the retreat discussions.

The outcome is hard to condense, but four themes helped us structure the conversation:

Organization and processes: How do team structures, responsibilities, and coordination mechanisms change when software becomes cheaper to create but remains expensive to understand, verify, operate, and maintain? Who pays that downstream cost, and where does it show up in organizations?

Engineering and craft: How can we integrate AI tools into software engineering workflows in a way that reliably improves the resulting software? Which deterministic checks can control increasingly non-deterministic workflows? When does an agent need more context, and when does more context only add noise? Where are the boundaries beyond which an AI agent should not work independently?

Cognition and skills: How do the roles of junior and senior developers change? What happens to ownership, beyond the legal meaning of the term? What are the long-term effects of cognitive offloading on software development skills? How do developers decide which claims about AI tools to believe, and which ones to distrust?

Reuse: Does code quality still matter in the same way if code can be generated quickly? How should we handle the one-off tools that AI agents write and execute during development sessions? Are we seeing a new category of “throwaway software”? And are code clones acceptable if they were generated by AI, or do they create the same maintenance risks as before?

We also discussed these questions with guests from SAP, hei_INNOVATION, and the Institute of Computer Science. The retreat format helped: Posters made ongoing work visible, and structured discussion formats such as 1-2-4-All and 15% Solutions moved us from broad questions toward more concrete collaboration plans.

For me, the main takeaway is that agentic AI in software engineering is not only a tooling topic. It is also about process, evaluation, cognition, skills, maintenance, and accountability. There are many claims about what these tools can do. There is still much less evidence about how they change the work of software development, which benefits actually materialize, and which costs are merely shifted elsewhere.

That gap is exactly where empirical software engineering research should contribute.

Figure: Group retreat participants on the Mathematikon terrace (SE@UHD group retreat, June 2026).


Sebastian Baltes
