Published on 21.08.2023
During the last decade, most European Photon and Neutron (PaN) facilities have adopted open data policies, making data available for the benefit of the entire scientific community. At the same time, machine learning (ML) is seen as an essential tool to address the exponential growth of data volumes from PaN facilities.
Exploitation of experimental training datasets is a key component of machine learning. The combination of ML algorithms and open data can therefore be seen as an ideal marriage that would ultimately help the entire community to tackle ‘big data’ challenges with more automation.
However, finding the right data to train machine learning algorithms is a challenge, and one of the motivations for making data FAIR (Findable, Accessible, Interoperable, Reusable) is precisely to provide scientists working on AI applications with quality training datasets.
But what does ‘quality’ mean to PaN science communities? What metadata fields are needed to find the data, to understand whether it is suitable for our research, and ultimately to be able to ingest it into our models for training? How can we provide sufficiently rich metadata? What would be the enablers for more machine learning applications? How can we improve the collaboration between data producers (domain scientists) and data consumers (ML experts)?
With this workshop, we aim to discuss these questions, among staff and users of the LEAPS and LENS facilities, across disciplines and across Europe.
We will present projects and teams that have successfully used open datasets from PaN facilities to train their specific ML application (data consumers), as well as domain scientists (data producers) who have published curated data specifically for ML applications.
We will also look at cases where it hasn’t worked so well, to identify what needs to be better curated on the FAIR data management side or understand the challenges in finding ML experts to effectively utilise the available data. A significant part of the workshop will be dedicated to discussion.
During the workshop, we have slots available for 20-minute presentations. We are particularly interested in contributions that can address the following points:
The workshop will be held at the SOLEIL synchrotron, Saint-Aubin (France), as a satellite event of the LEAPS General Assembly. Please note there is a 50€ registration fee for on-site participants. Remote participation is also possible and is free of charge.
Starting time: 17 October 2023, 12pm CEST
End time: 18 October 2023, 2pm CEST
Location: Synchrotron SOLEIL – CNRS – CEA Paris-Saclay, L’Orme des Merisiers Départementale 128, 91190 Saint-Aubin