Published on 21.10.2025
Bridging AI and Theory: Workshop on the Mathematics of Transformers at DESY
The term “artificial intelligence” has evolved well beyond its original meaning and can nowadays be encountered in many research fields and everyday situations. The advent of chatbots has brought deep learning into public discourse, and at the heart of these models lies a key component: the transformer.
On September 26, 2025, DESY provided the stage for a workshop on this key building block of modern large language models. The underlying mechanism, known as self-attention, has empirically established itself as a highly effective architectural choice. However, its inner workings and the concrete reasons for its performance are not yet understood in depth.
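For readers unfamiliar with the mechanism, here is a minimal sketch of single-head self-attention in NumPy. It is meant purely as an illustration: the function and variable names are our own, and real transformer implementations add multiple heads, masking, and learned parameters.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of token embeddings.

    X          : (n, d) array, one row per token
    Wq, Wk, Wv : (d, d) projection matrices for queries, keys, values
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv           # project tokens three ways
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # scaled pairwise similarities
    # Row-wise softmax: how strongly each token attends to every other token.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                         # convex combination of values

# Tiny example: 4 tokens with 8-dimensional embeddings and random weights.
rng = np.random.default_rng(0)
n, d = 4, 8
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)     # (4, 8)
```

Each output row is a convex combination of the value vectors, with weights determined by query–key similarity. It is exactly this data-dependent interaction between tokens that makes the mechanism powerful in practice and challenging to analyze in theory.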
A rapidly evolving field of mathematics tackles this gap between empirical success and theoretical understanding by underpinning the architecture with rigorous guarantees. Around 30 participants from diverse disciplines gathered to discuss the driving questions and key insights that will shape our fundamental understanding of modern deep learning.
The workshop was jointly organized by Martin Burger, Samira Kabri, Tim Roith, Lukas Weigand (Computational Imaging at DESY and Helmholtz Imaging) and Konstantin Riedl (University of Oxford). It was a satellite event to the Conference on Mathematics of Machine Learning, also co-organized by Martin Burger.
Summary
The workshop featured five invited talks given by
- Giuseppe Bruno (University of Bern) on “A multiscale analysis of mean-field transformers in the moderate interaction regime”
- Valérie Castin (ENS Paris) on “Mean-Field Transformer Dynamics with Gaussian Inputs”
- Subhabrata Dutta (TU Darmstadt) on “Transformers as token-to-token function learners”
- Borjan Geshkovski (Inria Paris) on “Dynamic metastability in self-attention dynamics”
- Michaël E. Sander (Google DeepMind) on “Transformers: From Dynamical Systems to Autoregressive In-Context Learners”
In the afternoon, participants joined a lively world café, sparking discussions that connected theoretical perspectives with practical relevance. Across all contributions, the common ground was a shared language: mathematics. Naturally, the discussions also touched on research questions of independent mathematical interest.
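To give a flavor of that shared language: a recurring picture in this area, underlying for instance the mean-field and metastability results presented in the talks, treats the tokens processed by a transformer as interacting particles on the unit sphere. One formulation studied in the literature reads, roughly (notation illustrative, details vary between the talks):

```latex
% Mean-field view of self-attention: n tokens x_i(t) on the unit sphere
% evolve by attending to one another (beta acts as an inverse temperature).
\begin{equation*}
  \dot{x}_i(t)
    = P_{x_i(t)}\!\left(
        \frac{1}{Z_i(t)} \sum_{j=1}^{n}
        e^{\beta \langle x_i(t),\, x_j(t) \rangle}\, x_j(t)
      \right),
  \qquad
  Z_i(t) = \sum_{k=1}^{n} e^{\beta \langle x_i(t),\, x_k(t) \rangle},
\end{equation*}
% where P_x denotes projection onto the tangent space of the sphere at x.
```

Questions such as the emergence of clusters or of long-lived metastable states then become statements about the long-time behavior of these dynamics.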
Outlook
The feedback was overwhelmingly positive. Although the field is still young, the workshop clearly boosted interest in the mathematics of transformers. For many of the participants, it was their first contact with the subject and probably not their last. Given the engagement and general curiosity, a second edition of this format is already on the horizon.
Stay tuned! Subscribe to our newsletter: https://helmholtz-imaging.de/newsletter/