Machine learning approaches for automatic classification of single-particle mass spectrometry data

G. Wang; H. Ruser; J. Schade; J. Passig; T. Adam; G. Dollinger; R. Zimmermann

doi:10.5194/amt-17-299-2024

Atmospheric Measurement Techniques (Jan 2024)

Affiliations

G. Wang: Department of Aerospace Engineering, Institute for Applied Physics and Measurement Technology, University of the Bundeswehr Munich, 85577 Neubiberg, Germany
H. Ruser: Department of Aerospace Engineering, Institute for Applied Physics and Measurement Technology, University of the Bundeswehr Munich, 85577 Neubiberg, Germany
J. Schade: Department of Mechanical Engineering, Institute of Chemistry and Environmental Engineering, University of the Bundeswehr Munich, 85577 Neubiberg, Germany
J. Schade: Joint Mass Spectrometry Centre, Institute of Chemistry, Division of Analytical and Technical Chemistry, University of Rostock, 18059 Rostock, Germany
J. Passig: Joint Mass Spectrometry Centre, Institute of Chemistry, Division of Analytical and Technical Chemistry, University of Rostock, 18059 Rostock, Germany
J. Passig: Joint Mass Spectrometry Centre, Helmholtz Zentrum München, 85764 Neuherberg, Germany
J. Passig: Department of Life, Light and Matter, Interdisciplinary Faculty, University of Rostock, 18059 Rostock, Germany
T. Adam: Department of Mechanical Engineering, Institute of Chemistry and Environmental Engineering, University of the Bundeswehr Munich, 85577 Neubiberg, Germany
T. Adam: Joint Mass Spectrometry Centre, Helmholtz Zentrum München, 85764 Neuherberg, Germany
G. Dollinger: Department of Aerospace Engineering, Institute for Applied Physics and Measurement Technology, University of the Bundeswehr Munich, 85577 Neubiberg, Germany
R. Zimmermann: Joint Mass Spectrometry Centre, Institute of Chemistry, Division of Analytical and Technical Chemistry, University of Rostock, 18059 Rostock, Germany
R. Zimmermann: Joint Mass Spectrometry Centre, Helmholtz Zentrum München, 85764 Neuherberg, Germany
R. Zimmermann: Department of Life, Light and Matter, Interdisciplinary Faculty, University of Rostock, 18059 Rostock, Germany

Journal volume & issue: Vol. 17, pp. 299 – 313

Abstract

The chemical composition of aerosol particles is a key parameter for their effects on human health and climate. Single-particle mass spectrometry (SPMS) has evolved into a mature technology with unique chemical coverage and the capability to analyze the distribution of aerosol components across the particle ensemble in real time. A fully automated characterization of the chemical profile of aerosol particles would enable selective real-time monitoring of air quality, e.g., for urgent risk assessments of particularly harmful pollutants. Aerosol particles are mostly classified with unsupervised clustering algorithms (ART-2a, K-means and their derivatives), which require manual postprocessing. In this work, we focus on supervised algorithms to tackle the automatic classification of large amounts of aerosol particle data. Supervised learning requires labeled data to train a predictive model. Therefore, we created a labeled benchmark dataset containing ∼ 24 000 particles in eight coarse categories that were highly abundant during a measurement in summer in Central Europe: elemental carbon (EC), organic carbon and elemental carbon (OC-EC), potassium-rich (K-rich), calcium-rich (Ca-rich), iron-rich (Fe-rich), vanadium-rich (V-rich), magnesium-rich (Mg-rich) and sodium-rich (Na-rich). Using the chemical features of the particles, we tested the performance of the following classical supervised algorithms: K-nearest neighbors, support vector machine, decision tree, random forest and multi-layer perceptron. This work shows that, despite the entrenched position of unsupervised clustering algorithms in the field, supervised algorithms have the potential to replace the manual step of clustering algorithms in many applications where real-time data analysis is essential. For the classification of the eight classes, the prediction accuracy of several supervised algorithms exceeded 97 %. The trained model classified ∼ 49 000 particles from a blind dataset in 0.2 s, also taking into account a class of “unclassified” particles. The predictions are highly consistent with the results obtained in a previous study using ART-2a.
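
To make the supervised workflow in the abstract more concrete, the following is a minimal sketch of one of the named algorithms, K-nearest neighbors, with a simple rejection rule for "unclassified" particles. Everything in it is an illustrative assumption rather than the authors' implementation: the three-channel synthetic "spectra", the function names `make_particle` and `knn_predict`, the values of `k` and `max_dist`, and in particular the distance-based rejection rule, since the abstract does not say how the "unclassified" class was assigned.

```python
import math
import random
from collections import Counter

# Assumed toy setup: one synthetic marker channel per class. Real SPMS
# spectra have many more m/z channels; this is for illustration only.
CLASSES = ["EC", "K-rich", "Fe-rich"]


def make_particle(label, rng, noise=0.05):
    """Synthetic feature vector: strong signal in the class channel plus noise."""
    vec = [abs(rng.gauss(0.0, noise)) for _ in CLASSES]
    vec[CLASSES.index(label)] += 1.0
    total = sum(vec)
    return [v / total for v in vec]  # normalize to unit total intensity


def knn_predict(train, x, k=5, max_dist=0.3):
    """Majority vote among the k nearest labeled particles.

    Returns "unclassified" when even the nearest labeled particle is
    farther than max_dist (an assumed rule, not taken from the paper).
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    if math.dist(nearest[0][0], x) > max_dist:
        return "unclassified"
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train = [(make_particle(c, rng), c) for c in CLASSES for _ in range(50)]
test = [(make_particle(c, rng), c) for c in CLASSES for _ in range(20)]

# Fraction of held-out synthetic particles assigned to their true class.
correct = sum(knn_predict(train, x) == y for x, y in test)
accuracy = correct / len(test)

# A particle with equal signal in every channel resembles no labeled class.
ambiguous_label = knn_predict(train, [1 / 3, 1 / 3, 1 / 3])
```

The split into labeled training particles and held-out test particles mirrors the role of the labeled benchmark dataset, and the final call mirrors the handling of a blind particle that fits none of the trained classes.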

Published in Atmospheric Measurement Techniques

ISSN: 1867-1381 (Print); 1867-8548 (Online)
Publisher: Copernicus Publications
Country of publisher: Germany
LCC subjects: Technology: Engineering (General). Civil engineering (General): Environmental engineering; Technology: Engineering (General). Civil engineering (General): Earthwork. Foundations
Website: http://www.atmospheric-measurement-techniques.net/home.html