Published on 27.10.2025

Helmholtz launches first UNLOCK benchmarking projects

The Helmholtz Association has selected the first projects to be funded under its UNLOCK call for benchmarking datasets, a key component of the Helmholtz agenda to support and improve the comparability of AI models and the quality of results. The call aims to strengthen reproducible and trustworthy AI across Helmholtz by supporting the creation of high-quality, multimodal, and cross-domain benchmarks. With these benchmark datasets, Helmholtz supports a new generation of AI benchmarking and responds in a structured way to the needs of the different scientific fields.
Through UNLOCK, Helmholtz leverages its vast and diverse data resources to establish a community of practice for AI benchmarking, bringing together domain scientists, AI researchers, and data experts from across the Association. The funded projects will develop open and reusable benchmarking datasets that set new standards for evaluating AI performance in scientific contexts.

“With this initiative, we are strengthening our competence in AI, providing the tools needed to understand complex new models, and unlocking their potential for scientific discovery. Most importantly, we are advancing the Helmholtz mission to tackle the grand challenges of our time through excellent science. This community-driven effort highlights again the dynamic and forward-looking spirit of the Helmholtz Information and Data Science community.” – Prof. Otmar D. Wiestler, President of the Helmholtz Association

Image: Svea Pietschmann, MDC

Each project will receive up to €150,000 from the Helmholtz Initiative and Networking Fund, matched by the same amount from the Centers. The selected teams will collaborate with Helmholtz AI, Helmholtz Imaging, HIFIS and HMC to ensure best practices in metadata handling, reproducible pipelines, and open access.

The projects will start in January 2026 and contribute to a series of workshops to share best practices and connect the emerging benchmarking community.

Meet the first UNLOCK projects

ADD-ON: Adenylation Domain Database and Online Benchmarking Platform

Helmholtz Centers & others: Helmholtz Centre for Infection Research (HZI), Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Saarland University, Myria Biosciences AG

Project lead: Jun.-Prof. Dr. Alexey Gurevich, Helmholtz Centre for Infection Research (HZI), Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Human-Microbe Systems Bioinformatics research group

Data: Paired enzyme-substrate data (genome sequences and chemical structures) from diverse bacteria

Challenge: ADD-ON addresses the lack of reliable data for predicting how microbial enzymes assemble peptide-based natural products. By enabling accurate AI-driven structure prediction, it accelerates the discovery of new bioactive compounds and ultimately supports efforts to combat antimicrobial resistance.

Read the full ADD-ON project profile

AIMBIS – Artificial Intelligence for Microscopic Biodiversity Screening

Helmholtz Centers: Helmholtz Centre for Environmental Research – UFZ, Department Physiological Diversity & Monitoring and Exploration Technologies & Data management, Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research (AWI), Section Polar Terrestrial Environmental Systems & Marine Sustainable Bioeconomy

Project lead: Dr. Susanne Dunker, Helmholtz Centre for Environmental Research – UFZ, Department Physiological Diversity, Research group Imaging Flow Cytometry

Data: Modern and fossil phytoplankton and pollen images derived from air, water, plants, pollinators or sediment samples measured by multispectral imaging cytometers

Challenge: Manual microscopic biodiversity monitoring is time-consuming and requires expert knowledge. This limits the scale of biodiversity monitoring and, with it, the ability to recognize the impacts of climate and environmental change on crucial ecosystem functions.

Read the full AIMBIS project profile

AMOEBE – lArge-scale Multi-mOdal Microbial livE-cell imaging BEnchmark

Helmholtz Centers: Forschungszentrum Jülich (FZJ), Helmholtz Centre for Environmental Research – UFZ, Karlsruhe Institute of Technology (KIT)

Project lead: Dr. Katharina Nöh, FZJ, Institute of Bio- and Geosciences, IBG-1: Biotechnology, Modeling of Biochemical Networks and Cells

Data: AMOEBE uses high-resolution time-lapse microscopy data of microbial species and communities, covering diverse imaging modalities, cultivation systems, and organisms. The data originate from the microbial life sciences domain and include standardized metadata to ensure FAIR use and cross-domain applicability for AI research.

Challenge: Building a large-scale, FAIR benchmark for AI-driven analysis of microbial communities using time-lapse microscopy to advance understanding of microbial dynamics, ecosystem stability, and their role in health and biotechnology.

Read the full AMOEBE project profile

BASE – Benchmarking Agro-environmental database for Sustainable agriculture intensification

Helmholtz Center: Helmholtz Centre for Environmental Research – UFZ

Project lead: Dr. Shekhar Sharan Goyal, UFZ, Department of Computational Hydro-Systems (CHS)

Data: Spatio-temporally detailed crop-specific agronomic, economic, and climatological data from multiple sources, including field surveys, government records, remote sensing, and peer-reviewed literature

Challenge: Building a BASE dataset enables robust predictions of yield potential, resource efficiency, and sustainability thresholds, driving climate resilience and sustainable agricultural intensification

Read the full BASE project profile

Image: Ingmar Nitze, AWI (BSIC 2021 contribution)

ForestUNLOCK: A multi-modal Multiscale Benchmark Dataset for AI-Driven Boreal Forest Monitoring and Carbon Accounting

Helmholtz Centers: Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research (AWI), German Aerospace Center (DLR), GFZ Helmholtz Centre for Geosciences

Project lead: Dr. Stefan Kruse, AWI, Section Polar Terrestrial Environmental Systems

Data: Spatially co-registered and multi-temporal data from terrestrial, airborne, and spaceborne sensors in two and three dimensions

Challenge: Building the first consistent multi-modal single tree benchmark for forest structure and carbon stock assessments of the northern boreal forest

Read the full ForestUNLOCK project profile

GRIDMARK – Generating Reproducible Insights through Data Benchmarking for AI in Energy Systems

Helmholtz Centers: Karlsruhe Institute of Technology (KIT) and Forschungszentrum Jülich (FZJ)

Project leads: TT.-Prof. Dr. Benjamin Schäfer (KIT), Dr. Daniele Carta and Prof. Dr. Andrea Benigni (FZJ)

Data: Time series and tabular data from distribution-level power grids, based on the infrastructures of KIT’s EnergyLab and FZJ’s Living Lab Energy Campus

Challenge: Transforming energy systems toward climate neutrality: Distribution grids have the potential to be catalysts for the energy transition. Unfortunately, most Distribution System Operators lack the resources to fully monitor their systems. Therefore, there is an urgent need for more high-quality data, particularly to develop and test machine learning models.

Read the full GRIDMARK project profile

NeuroHarmonize – A Benchmark Decentralized Data Harmonization Workflow for AI-Driven Alzheimer’s Disease Management

Helmholtz Center: Karlsruhe Institute of Technology (KIT)

Project lead: Dr. Bahar Dadfar

Data: Multimodal Alzheimer’s disease datasets spanning imaging modalities (e.g. MRI, PET, CT, EEG), cognitive scores (e.g. MMSE, ADAS-Cog), genetic profiles (e.g. SNPs, APOE status), and clinical records. Data are sourced from public cohorts such as ADNI, AIBL, and OASIS, as well as a clinical partner hospital.

Challenge: The benchmark addresses the lack of harmonized, reproducible, and privacy-preserving multimodal datasets for Alzheimer’s disease (AD). Current AI models struggle with fragmented and non-standardized data, which limits their generalizability and clinical deployment. NeuroHarmonize creates a FAIR-compliant, decentralized benchmarking framework to accelerate reliable, transparent, and collaborative AI for AD diagnosis, prognosis, and long-term monitoring.

Read the full NeuroHarmonize project profile

Image: Georg Kislinger, Martina Schifferer, Christian Haass & Maryam Khojasteh-Farat, DZNE (BSIC 2021 contribution)

Pero – Unlocking ML Potential: Benchmark Datasets on Perovskite Thin Film Processing

Helmholtz Centers & others: Karlsruhe Institute of Technology (KIT), Institute of Microstructure Technology, Next Generation Photovoltaics, Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Department Solution Processing of Hybrid Materials & Devices, Karlsruhe Institute of Technology (KIT), Scientific Computing Center, Helmholtz AI Consultants Energy

Project lead: Prof. Dr. Ulrich W. Paetzold, KIT, Institute of Microstructure Technology

Data: Time-resolved photoluminescence imaging and video data from perovskite thin-film fabrication, with corresponding photovoltaic device performance data

Challenge: Addressing the lack of standardized, FAIR benchmark datasets in perovskite photovoltaics. Pero enables reproducible AI models for efficiency prediction, material classification, and defect detection, which are critical for industrial scaling of sustainable energy technologies.

Read the full Pero project profile

RenewBench – A Global Benchmark for Renewable Energy Generation

Helmholtz Centers: Karlsruhe Institute of Technology (KIT) & Hereon

Project lead: Kaleb Phipps, KIT

Data: Combination of open-source renewable energy data for solar, wind, and hydropower from Europe (ENTSO-E Transparency Platform), the USA (U.S. Energy Information Administration), and Australia (Australian Energy Market Operator) with global meteorological data based on the ECMWF Reanalysis v5 dataset (ERA5), the ECMWF Integrated Forecasting System, the ICON-DREAM reanalysis dataset from the German Weather Service, and the High-Resolution Rapid Refresh data from the National Oceanic & Atmospheric Administration

Challenge: Transitioning to a sustainable energy system is challenging due to the inherently variable and decentralized nature of renewable energy generation, making it difficult to ensure grid reliability, coordinate energy dispatch, and maintain system stability. AI energy meteorology models, which couple the meteorological and energy domains, show huge potential for solving these challenges. However, their development is currently limited by a lack of high-quality, standardized, and representative data. Therefore, RenewBench aims to lower barriers for such model development and to increase trust in these AI models by enabling transparent comparisons, ultimately accelerating the global shift to a sustainable energy system.

Read the full RenewBench project profile

SCHEMA – profiling Spatial Cancer HEterogeneity across modalities to benchmark Metastasis risk prediction

Helmholtz Centers & others: Helmholtz Munich (HM), Karlsruhe Institute of Technology (KIT), University Clinic Tübingen, Institute of Pathology, Lamin.AI, Data Intuitive

Project leads: Dr. Malte Lücken, HM, Institute of Computational Biology, Institute of Lung Health & Immunity, Prof. Dr. Markus Diefenbacher, HM, Institute of Lung Health & Immunity

Data: Spatial transcriptomics and spatial proteomics profiles from patient-derived cancer samples (lung, colon, and breast), publicly available and in-house generated

Challenge: Metastases represent a significant exacerbation of tumor severity. Predicting the likelihood that a tumor will metastasize could inform treatment decisions to avoid or delay this outcome. SCHEMA develops a benchmark dataset of primary tumor samples with metadata on whether the tumor has metastasized at different time points after sampling. With this dataset, a challenge for machine learning scientists will be defined to build prognostic models for the likelihood of metastasis, promoting innovation in prognostic modeling for a clinically relevant task.

Read the full SCHEMA project profile

Image: Hellmut Augustin, DKFZ (BSIC 2021 contribution)

TIMELY: Time-series Integration across Modalities for Evaluation of Latent DYnamics

Helmholtz Centers: Helmholtz Munich (HM), German Center for Neurodegenerative Diseases (DZNE), Stanford University

Project lead: Dr. Steffen Schneider, HM

Data: Multimodal time-series data from:

Neuroscience – Cortical activity under anesthesia, Functional recordings from brain organoids/monolayer cultures, Calcium imaging from brain–heart interaction in zebrafish
Cell and molecular biology – Yeast live-cell imaging dataset, Longitudinal transcriptomics (NTVE)
Behavior – Mouse behavior and annotated behavioral motifs, open field behavioral data from the German Mouse Clinic
Ecology and biodiversity – Various ecological time-series

Challenge: TIMELY provides the first comprehensive benchmark for multimodal biological time-series data, addressing the lack of standardized, high-quality datasets for modeling complex dynamical systems. It fosters the development of statistical and foundation models tailored to the analytical needs of research in biomedicine and neuroscience.

Read the full TIMELY project profile

UQOB – Uncertainty Quantification in Object-detection Benchmark

Helmholtz Centers: Helmholtz Munich (HM), German Cancer Research Center (DKFZ)

Project lead: Dr. Marie Piraud, HM, Head of the consultant team, Helmholtz AI

Data: High-resolution 2D microscopy images of organoids from human and murine lung and colon tissues, annotated with bounding boxes and class labels

Challenge: Creating a benchmark dataset for object-detection and Uncertainty Quantification (UQ) in a multi-rater setting, to address annotation variability and AI model evaluation.

Read the full UQOB project profile

Together, these projects mark the first step in building a Helmholtz-wide ecosystem for benchmarking, an important foundation for trustworthy and reproducible AI that will benefit science, industry, and society alike.

See all UNLOCK projects

Learn more about the call