Published on 26.11.2024
Reproducibility is a cornerstone of the open science movement, enhancing scientific integrity and bolstering public confidence in research outcomes. Helmholtz plays a pivotal role in advocating for reproducible and robust research on both national and international fronts.
Building on the success of the previous online workshops, “Enabling Reproducibility in Data Science” and “Love Your Data? Make It Reproducible!”, as well as the inaugural on-site Helmholtz Reproducibility Workshop at GFZ in Potsdam, Helmholtz is thrilled to reunite the Helmholtz community in person at the Max Delbrück Center | MDC-BIMSB.
This fourth workshop, co-organized by the Helmholtz Open Science Office and the Helmholtz Information & Data Science Academy (HIDA) in collaboration with local experts at the Max Delbrück Center, will explore the essence of reproducible science and its transformative potential. This year, the focus will be on reproducibility within the Helmholtz research field “Health,” reflecting the expertise of our host center. Nonetheless, the workshop will also address broader aspects of reproducibility, ensuring its relevance to researchers across all fields. The event aims to shape the future of reproducible, robust, and transparent research within the Helmholtz community.
The workshop will kick off with two keynotes on reproducibility, delivered by Altuna Akalin (MDC) and Frieder Paulus (Lübeck University). Both keynotes will be livestreamed so that colleagues at all Helmholtz Centers can follow them. After the keynotes, participants can join one of two hands-on workshops designed to provide practical tools and insights for integrating reproducibility into their scientific practice.
In the era of AI, ensuring software reproducibility for data processing and machine learning workflows is more critical than ever, particularly in computationally intensive fields like bioinformatics. Reproducibility guarantees consistent outputs across different environments, which is essential for validating research findings and disseminating workflows widely. However, the complexity of managing software dependencies, especially as versions rapidly evolve, poses significant challenges.
This talk presents a principled approach to building reproducible analysis pipelines using tools like GNU Guix, demonstrated through the PiGx pipelines for RNA sequencing and other bioinformatics applications. We explore how these pipelines, by encapsulating dependencies and providing standardized outputs, serve as a model for reproducibility in AI-driven workflows. Additionally, we discuss the implications of integrating AI and machine learning in research, emphasizing the need for reproducible practices to maintain the integrity of AI applications across various scientific domains.
Speaker: Dr. Altuna Akalin (Max Delbrück Center, MDC)
Academics are under considerable pressure to manage their visibility and reputation as their careers develop. As in non-academic work, the way they compete and coexist is shaped by the particular incentive structures of their work environment. In this talk, I will argue for the importance of viewing reproducibility and its “crisis” in the context of an academic system that values visibility through publications and confuses measures of journal rank with measures of quality and good scientific practice. In addition, I will draw attention to novel ideas, currently under discussion, for the distribution of third-party funding that may have the potential to reduce discrimination against marginalized groups and lower costs in the academic workforce.
Speaker: Dr. Frieder Paulus (Lübeck University)
Reproducibility is often seen as an important cornerstone for the advancement of science. But what does it actually mean and how does it affect your daily work? In this workshop, we will explore different types of reproducibility, strategies to improve it, and also some probabilistic thoughts on why we will never achieve full reproducibility.
Speaker: Dr. Ulf Toelch (BIH QUEST Centre for Responsible Research at Charité Universitätsmedizin Berlin)
In this workshop on image prevalidation, we will discuss why it is important to ensure that image data is suitable for its intended analysis. Images can contain hidden quality issues, inconsistencies, or artifacts that affect analysis results, often in subtle ways. Through real-world examples, we will explore how to detect these issues early on and discuss image quality and consistency across data collections. We aim to create an open, interactive environment where participants can share their experiences and challenges with image data. We will introduce practical approaches to assessing image data before analysis begins, while encouraging discussion of best practices and common pitfalls. Our goal is to foster communication and mutual learning, helping everyone to improve their image validation workflows.
Speaker: Deborah Schmidt (Helmholtz Imaging Engineering & Support Unit MDC)
The event will conclude with a shared lunch, fostering networking opportunities among attendees.
The Helmholtz Open Science Office supports the Helmholtz Association as a service provider in shaping the cultural change towards open science. It represents Helmholtz in various open science initiatives, is involved in third-party funded projects, and in this way communicates the Helmholtz positions on open science on a national and international level.
HIDA – the Helmholtz Information & Data Science Academy – is Germany’s largest postgraduate training network in the field of information and data science, preparing the next generation of scientists for a data-intensive future of research.
This year, HIDA and the Open Science Office are supported by colleagues from the host center, namely the Research Data Management Unit at the Max Delbrück Center, who provide information, consultation, support, and training to researchers through all phases of the research data life cycle at MDC.