Data Science Journal (Aug 2022)

Machine Learning Applied for Spectra Classification in X-ray Free Electron Laser Sciences

  • Yue Sun
  • Sandor Brockhauser

DOI
https://doi.org/10.5334/dsj-2022-015
Journal volume & issue
Vol. 21, no. 1

Abstract

Spectroscopic techniques are widely used and produce huge amounts of data, especially at facilities with very high repetition rates. At the European XFEL, X-ray pulses can be generated with a separation of only 220 ns, at up to 27,000 pulses per second. In experiments at the different scientific instruments, spectral changes can indicate a change in the system under investigation and therefore the progress of the experiment. Immediate feedback on the actual state (e.g. the time-resolved status of the sample) would be essential for quickly judging how to proceed with the experiment. Hence, we aim to capture two major spectral changes: the change in the intensity distribution of peaks at certain locations (e.g. a peak dropping or appearing), and the shift of peaks in the spectrum. Machine Learning (ML) opens up new avenues for data-driven analysis in spectroscopy by making it possible to recognize such specific changes quickly and to implement an online feedback system that can be used in near real time during data collection. On the other hand, ML requires large amounts of clearly annotated data. Hence, it is important that experimental data be managed according to the FAIR principles. For XFEL experiments, we suggest introducing the NeXus glossary and the corresponding data format standards for future experiments. An example is presented to demonstrate how neural-network-based ML can accurately classify the state of an experiment, provided that properly annotated data are available.
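As a rough illustration of the two spectral changes described above, the sketch below builds synthetic spectra from Gaussian peaks and labels each one as unchanged, changed in peak intensity, or shifted in peak position. It is not the authors' neural network: it is a rule-based stand-in that only shows what such labels could mean. The Gaussian peak shape, the function names (`gaussian_peak`, `make_spectrum`, `describe_change`), the energy window, and both tolerances are assumptions made for the example.

```python
import numpy as np


def gaussian_peak(x, center, height, width=2.0):
    """Single Gaussian peak evaluated on the axis x (assumed peak shape)."""
    return height * np.exp(-0.5 * ((x - center) / width) ** 2)


def make_spectrum(x, centers, heights):
    """Sum of Gaussian peaks; a synthetic stand-in for a measured spectrum."""
    return sum(gaussian_peak(x, c, h) for c, h in zip(centers, heights))


def describe_change(x, reference, spectrum, window=(40.0, 60.0),
                    intensity_tol=0.2, shift_tol=1.0):
    """Label the change of one peak inside `window` relative to a reference.

    The window and both tolerances are hypothetical values for illustration.
    """
    mask = (x >= window[0]) & (x <= window[1])

    # Change of intensity: integrated signal in the window versus the reference.
    relative_intensity = spectrum[mask].sum() / reference[mask].sum()

    # Peak shift: position of the maximum in the window versus the reference.
    shift = x[mask][np.argmax(spectrum[mask])] - x[mask][np.argmax(reference[mask])]

    if abs(relative_intensity - 1.0) > intensity_tol:
        return "intensity_change"
    if abs(shift) > shift_tol:
        return "peak_shift"
    return "unchanged"


# Arbitrary axis with two peaks; only the peak near 50 is monitored.
x = np.linspace(0.0, 100.0, 501)
reference = make_spectrum(x, centers=[20.0, 50.0], heights=[0.5, 1.0])
dropped = make_spectrum(x, centers=[20.0, 50.0], heights=[0.5, 0.3])
shifted = make_spectrum(x, centers=[20.0, 55.0], heights=[0.5, 1.0])

for name, spectrum in [("reference", reference), ("dropped", dropped), ("shifted", shifted)]:
    print(name, describe_change(x, reference, spectrum))
```

Labelled pairs of this kind (a spectrum plus its state) are the sort of annotated data a trained classifier would need in place of the fixed thresholds used here.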

Keywords