NeuroHarmonize – A Benchmark Decentralized Data Harmonization Workflow for AI-Driven Alzheimer’s Disease Management

Image: Georg Kislinger, Martina Schifferer, Christian Haass & Maryam Khojasteh-Farat, DZNE (BSIC 2021 contribution)

What is the project about?

NeuroHarmonize develops a FAIR, multimodal benchmark for Alzheimer’s disease, integrating imaging, genetics, cognitive, and clinical data. Using federated learning and blockchain governance, it standardizes evaluation tasks, supports reproducibility, and fosters collaboration to enable clinically deployable AI decision-support systems.

What main scientific or societal challenge does the benchmark address?

The benchmark addresses the lack of harmonized, reproducible, and privacy-preserving multimodal datasets for Alzheimer’s disease (AD). Current AI models struggle with fragmented and non-standardized data, which limits their generalizability and clinical deployment. NeuroHarmonize creates a FAIR-compliant, decentralized benchmarking framework to accelerate reliable, transparent, and collaborative AI for AD diagnosis, prognosis, and long-term monitoring.

What motivated you to apply for UNLOCK, and how does the project align with the initiative’s vision?

We applied to hyperUNLOCK to advance reproducibility and accessibility in clinical AI, goals central to its mission. NeuroHarmonize aligns with this vision by delivering an open, FAIR-compliant benchmark, enabling transparent model comparison, and creating decentralized infrastructures that accelerate scalable, trustworthy solutions for clinical AI deployment.

How does the benchmark dataset support reproducibility, robustness, and fairness in AI research?

The dataset will be FAIR-compliant, BIDS- and OpenML-standardized, and include blockchain-governed provenance tracking. Robustness is ensured through cluster-specific harmonization, fusion-based modeling, and evaluation under missing modalities. Fairness is addressed via demographic subgroup analysis, bias mitigation metrics, and transparent leaderboards, promoting trustworthy AI practices.
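As an illustration of what a demographic subgroup analysis can look like in practice, the following minimal sketch computes accuracy per subgroup and the largest gap between subgroups. The function names, the toy labels, and the choice of accuracy as the metric are assumptions made for this example; the project does not specify which bias metrics it will report.

```python
from collections import defaultdict


def subgroup_accuracy(y_true, y_pred, groups):
    """Return the accuracy of the predictions within each subgroup."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(per_group):
    """Return the difference between the best and worst subgroup accuracy."""
    return max(per_group.values()) - min(per_group.values())


# Toy data (hypothetical): binary diagnosis labels and one demographic attribute.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]

per_group = subgroup_accuracy(y_true, y_pred, groups)
gap = accuracy_gap(per_group)
print(per_group, gap)
```

A leaderboard could report such a gap alongside the overall score, so that a model which performs well on average but poorly for one subgroup is visible as such.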

What is the project’s structure — from data curation to expected outputs such as publications or competitions?

The project has four work packages:

1. secure multimodal data collection and blockchain-based governance;
2. harmonization and preprocessing with clustering/optimization pipelines;
3. design of benchmarking tasks and training of baseline AI models; and
4. development of fusion architectures, public release, and leaderboard deployment.

Outputs include an open dataset, a preprocessing toolkit, AI baselines, publications, and international benchmarking competitions.
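To give a rough sense of the provenance idea behind the first work package, the sketch below keeps a hash-chained log of data-processing events, so that any later change to an earlier record can be detected. It is a simplified stand-in, not the project's governance mechanism: a real blockchain adds distributed consensus and access control, and the record fields and function names here are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash that precedes the first record


def _digest(prev_hash, payload):
    """Hash a record body deterministically (keys sorted before encoding)."""
    body = json.dumps({"prev_hash": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a provenance record that is linked to the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append(
        {"prev_hash": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)}
    )
    return chain


def verify_chain(chain):
    """Return True only if every record's link and hash are still intact."""
    prev_hash = GENESIS_HASH
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(record["prev_hash"], record["payload"]):
            return False
        prev_hash = record["hash"]
    return True


# Hypothetical events for one dataset passing through the pipeline.
log = []
append_record(log, {"step": "collected", "site": "site-A"})
append_record(log, {"step": "harmonized", "site": "site-A"})
print(verify_chain(log))
```

Because each record stores the hash of its predecessor, editing an early payload invalidates that record's hash and breaks verification of the whole log.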

Other projects

Image: Open white spruce forest with glacier in background in the Chugach Mountains, Alaska, US ©Stefan Kruse, AWI

ForestUNLOCK: A multi-modal Multiscale Benchmark Dataset for AI-Driven Boreal Forest Monitoring and Carbon Accounting

Building the first consistent multi-modal single tree benchmark for forest structure and carbon stock assessments of the northern boreal forest

Image: Ingmar Nitze, AWI (BSIC 2021 contribution)

BASE: Benchmarking Agro-environmental database for Sustainable agriculture intensification

Building a BASE dataset enables robust predictions of yield potential, resource efficiency, and sustainability thresholds, driving climate resilience and sustainable agricultural intensification

GRIDMARK – Generating Reproducible Insights through Data Benchmarking for AI in Energy Systems

Transforming energy systems toward climate neutrality: Distribution grids have the potential to be catalysts for the energy transition. Unfortunately, most Distribution System Operators lack the resources to fully monitor their systems. Therefore, there is an urgent need for more high-quality data, particularly to develop and test machine learning models.
