MEDAL

A side profile of a human head formed by glowing blue lines and circuit-like patterns, suggesting a digital or artificial intelligence representation. The face appears semi-transparent and geometric, with interconnected nodes and pathways extending outward. Binary code is faintly visible on the left side, while a blurred laptop sits in the background, reinforcing a technology-focused setting.
Image: Envato Elements

Data is the driving force behind meaningful progress in research, particularly in artificial intelligence (AI) for medical imaging. Providing high-quality, relevant datasets can strategically channel international scientific and technical resources toward the most pressing and impactful clinical problems. However, existing benchmarks often fall short because they focus on narrow, repetitive, and clinically irrelevant tasks. To address this gap, the Medical Imaging AGI’s Last Exam (MEDAL) initiative introduces a top-down approach to benchmark design. MEDAL aims to create the world’s most challenging and clinically impactful benchmark – a “final exam” that artificial general intelligence (AGI) systems in medical imaging must pass to demonstrate transformative clinical utility.

Through a global crowdsourcing campaign, MEDAL will collect high-stakes, real-world clinical questions and multimodal data that reflect the complexity and variability of everyday medical practice. A multidisciplinary, international consortium of leaders from fields including AI, clinical medicine, epidemiology, global health, biomedical ethics, and regulatory science will guide the benchmark's design and implementation to ensure alignment with real clinical needs. The project tackles four core limitations of existing benchmarks: low diversity (addressed via global crowdsourcing), limited clinical relevance (countered by a top-down approach driven by diverse stakeholders), data sparsity (mitigated through incentivized contributions), and benchmark saturation (countered by using unpublished datasets). The result will be a trustworthy benchmarking framework that sets a new gold standard for evaluating and advancing AI in healthcare, steering the field away from trivial exercises and toward the most pressing challenges in medical imaging.

Annika Reinke, Helmholtz Imaging

Other projects



BestMeta

Behavioral Standard Metadata

Developing metadata standards and FAIR analysis pipelines for Video Tracking Assays (VTAs) in toxicology and medical sciences.

Two people standing in a computer center; COMFORT logo is integrated in the image
Image: Tim Roith, DESY

COMFORT

COMFORT aims to achieve breakthroughs in developing compact, flexible, and robust machine learning models for image, audio, and network data. In doing so, its application-oriented research program will advance the mathematical understanding of machine learning at the intersection of effectiveness and robustness.

GLAM, third-party funded project, Helmholtz Imaging

GLAM: Generative lung architecture modeling

This project is developing generative methods for designing bio-printable lung tissues across a spectrum of disease severity in the specific context of mouse and human lung disease.