Validating neural networks for spectroscopic classification on a universal synthetic dataset

Jan Schuetzke; Nathan J. Szymanski; Markus Reischl

doi:10.1038/s41524-023-01055-y

npj Computational Materials (Jun 2023)

Affiliations

Jan Schuetzke: Institute for Automation and Applied Informatics, Karlsruhe Institute of Technology
Nathan J. Szymanski: Department of Materials Science & Engineering, Lawrence Berkeley National Laboratory
Markus Reischl: Institute for Automation and Applied Informatics, Karlsruhe Institute of Technology

DOI: https://doi.org/10.1038/s41524-023-01055-y
Journal volume & issue: Vol. 9, no. 1, pp. 1–12

Abstract

To aid the development of machine learning models for automated spectroscopic data classification, we created a universal synthetic dataset for the validation of their performance. The dataset mimics the characteristic appearance of experimental measurements from techniques such as X-ray diffraction, nuclear magnetic resonance, and Raman spectroscopy, among others. We applied eight neural network architectures to classify artificial spectra, evaluating their ability to handle common experimental artifacts. While all models achieved over 98% accuracy on the synthetic dataset, misclassifications occurred when spectra had overlapping peaks or intensities. We found that non-linear activation functions, specifically ReLU in the fully-connected layers, were crucial for distinguishing between these classes, while adding more sophisticated components, such as residual blocks or normalization layers, provided no performance benefit. Based on these findings, we summarize key design principles for neural networks in spectroscopic data classification and publicly share all scripts used in this study.
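
The following is a minimal sketch, not the authors' dataset generator, of why classes with overlapping peaks are hard to separate. It assumes Gaussian peak shapes on a 500-point axis; the function names, peak positions, heights, and width are all hypothetical values chosen for illustration.

```python
import math

def synthetic_spectrum(peaks, n_points=500, width=3.0, shift=0.0):
    """Sum Gaussian peaks given as (position, height) pairs; scale the maximum to 1.

    Assumption: Gaussian line shape and a uniform peak width. The `shift`
    argument mimics a simple positional artifact.
    """
    signal = [
        sum(h * math.exp(-0.5 * ((x - (p + shift)) / width) ** 2) for p, h in peaks)
        for x in range(n_points)
    ]
    top = max(signal)
    return [v / top for v in signal]

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical classes: A and B share all peak positions and differ in one
# intensity only; C has peaks at entirely different positions.
class_a = synthetic_spectrum([(100, 1.0), (250, 0.6), (400, 0.3)])
class_b = synthetic_spectrum([(100, 1.0), (250, 0.4), (400, 0.3)])
class_c = synthetic_spectrum([(60, 0.5), (180, 1.0), (330, 0.7)])

sim_ab = cosine_similarity(class_a, class_b)  # overlapping classes
sim_ac = cosine_similarity(class_a, class_c)  # well-separated classes
print(f"A vs B: {sim_ab:.3f}, A vs C: {sim_ac:.3f}")
```

In this toy setting the two classes with shared peak positions are nearly identical as vectors, while the class with distinct positions is almost orthogonal to them, which is consistent with the abstract's observation that misclassifications concentrate on spectra with overlapping peaks or intensities.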

Published in npj Computational Materials

ISSN: 2057-3960 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Materials of engineering and construction. Mechanics of materials; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://www.nature.com/npjcompumats/
