Published on 11.09.2025

Mathematics of Transformers


This workshop will focus on the transformer architecture and its underlying (self-)attention mechanisms, which have gained substantial interest in recent years. Despite their empirical success and groundbreaking advances in natural language processing, computer vision, and scientific computing, the mathematical understanding of transformers is still in its infancy, with many fundamental questions only starting to be posed and addressed.
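As a brief reminder of the central object: the standard scaled dot-product attention underlying these models can be written as

\[
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right) V,
\]

where the query, key, and value matrices Q, K, and V are obtained from learned linear maps of the input tokens, and d_k denotes the key dimension.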

We aim to bring together researchers with backgrounds in multi-agent dynamics, optimal transport, and PDEs to initiate discussions on a variety of aspects connected to the theoretical principles governing transformers. By fostering this exchange, we seek to advance this young and rapidly evolving research field and to uncover new mathematical perspectives on transformer models; one such perspective is sketched below.
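To give one concrete example of such a connection (a simplified model from the recent literature, not the full architecture): viewing the tokens x_1(t), ..., x_n(t) as interacting particles on the unit sphere, the action of successive attention layers can be idealized as the continuous-time dynamics

\[
\dot{x}_i(t) = \mathbf{P}_{x_i(t)}\!\left(\frac{1}{Z_{\beta,i}(t)} \sum_{j=1}^{n} e^{\beta \langle x_i(t),\, x_j(t)\rangle}\, x_j(t)\right),
\qquad
Z_{\beta,i}(t) = \sum_{j=1}^{n} e^{\beta \langle x_i(t),\, x_j(t)\rangle},
\]

where \(\mathbf{P}_x\) denotes the orthogonal projection onto the tangent space of the sphere at x, \(\beta > 0\) plays the role of an inverse temperature, and the weight matrices are set to the identity for simplicity. In this picture, questions about the clustering of tokens across layers become questions about multi-agent dynamics, their mean-field PDE limits, and optimal transport.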

Confirmed speakers

  • Giuseppe Bruno (University of Bern)
  • Valérie Castin (ENS Paris)
  • Subhabrata Dutta (TU Darmstadt)
  • Borjan Geshkovski (Inria Paris)
  • Michaël E. Sander (Google DeepMind)

This is a satellite event of the Conference on Mathematics of Machine Learning 2025, which takes place at TUHH from September 22 to 25, 2025.

Organizers

Martin Burger (Helmholtz Imaging, DESY and University of Hamburg)
Samira Kabri (Helmholtz Imaging, DESY)
Konstantin Riedl (University of Oxford)
Tim Roith (Helmholtz Imaging, DESY)
Lukas Weigand (Helmholtz Imaging, DESY)