Extracting structural motifs from pair distribution function data of nanostructures using explainable machine learning

Andy S. Anker; Emil T. S. Kjær; Mikkel Juelsholt; Troels Lindahl Christiansen; Susanne Linn Skjærvø; Mads Ry Vogel Jørgensen; Innokenty Kantor; Daniel Risskov Sørensen; Simon J. L. Billinge; Raghavendra Selvan; Kirsten M. Ø. Jensen

doi:10.1038/s41524-022-00896-3

npj Computational Materials (Oct 2022)

Affiliations

Andy S. Anker: Department of Chemistry and Nano-Science Center, University of Copenhagen
Emil T. S. Kjær: Department of Chemistry and Nano-Science Center, University of Copenhagen
Mikkel Juelsholt: Department of Materials, University of Oxford
Troels Lindahl Christiansen: Department of Chemistry and Nano-Science Center, University of Copenhagen
Susanne Linn Skjærvø: Department of Chemistry and Nano-Science Center, University of Copenhagen
Mads Ry Vogel Jørgensen: Department of Chemistry & iNANO, Aarhus University
Innokenty Kantor: MAX IV Laboratory, Lund University
Daniel Risskov Sørensen: Department of Chemistry & iNANO, Aarhus University
Simon J. L. Billinge: Department of Applied Physics and Applied Mathematics, Columbia University
Raghavendra Selvan: Department of Computer Science, University of Copenhagen
Kirsten M. Ø. Jensen: Department of Chemistry and Nano-Science Center, University of Copenhagen

DOI: https://doi.org/10.1038/s41524-022-00896-3
Journal volume & issue: Vol. 8, no. 1, pp. 1 – 11

Abstract

Characterization of material structure with X-ray or neutron scattering using, e.g., Pair Distribution Function (PDF) analysis most often relies on refining a structure model against an experimental dataset. However, identifying a suitable model is often a bottleneck. Recently, automated approaches have made it possible to test thousands of models for each dataset, but these methods are computationally expensive, and analysing the output, i.e. extracting structural information from the resulting fits in a meaningful way, is challenging. Our Machine Learning based Motif Extractor (ML-MotEx) trains an ML algorithm on thousands of fits and uses SHAP (SHapley Additive exPlanation) values to identify which model features are important for the fit quality. We use the method for four different chemical systems, including disordered nanomaterials and clusters. ML-MotEx opens up a type of modelling where each feature in a model is assigned an importance value for the fit quality based on explainable ML.
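The idea of assigning each model feature an importance value for the fit quality can be illustrated with a minimal sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the features are taken to be 0/1 flags for whether an atom is kept in a candidate structure, the "thousands of fits" are replaced by synthetic Rwp values with made-up effects (`TRUE_EFFECT`), the ML algorithm is replaced by a plain lookup table of mean Rwp per flag pattern, and exact Shapley values are computed by brute-force enumeration rather than with the SHAP library. All function and variable names are hypothetical.

```python
import itertools
import math
import random

# Toy, synthetic stand-in for a large set of fits: each candidate structure
# model is a tuple of 0/1 flags (atom kept or removed) with a made-up Rwp.
# None of these numbers come from the paper.
N_ATOMS = 4
TRUE_EFFECT = [-0.20, -0.10, 0.00, 0.15]  # hypothetical change in Rwp when atom i is kept


def synthetic_rwp(flags, rng):
    """Made-up fit quality: additive atom effects plus small Gaussian noise."""
    return 0.60 + sum(e * f for e, f in zip(TRUE_EFFECT, flags)) + rng.gauss(0, 0.01)


def train_lookup_model(fits):
    """'Train' the simplest possible regressor: mean Rwp per flag pattern."""
    sums, counts = {}, {}
    for flags, rwp in fits:
        sums[flags] = sums.get(flags, 0.0) + rwp
        counts[flags] = counts.get(flags, 0) + 1
    table = {k: sums[k] / counts[k] for k in sums}
    return lambda flags: table[flags]


def shapley_values(model, x, background):
    """Exact Shapley values of model(x) by enumerating all feature subsets."""
    n = len(x)

    def value(subset):
        # Features in `subset` are fixed to x; the others are averaged
        # over the background set.
        total = 0.0
        for z in background:
            mixed = tuple(x[i] if i in subset else z[i] for i in range(n))
            total += model(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for subset in itertools.combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


rng = random.Random(0)
patterns = list(itertools.product([0, 1], repeat=N_ATOMS))
fits = [(p, synthetic_rwp(p, rng)) for p in patterns for _ in range(50)]
model = train_lookup_model(fits)

x = (1, 1, 1, 1)  # explain the candidate model in which every atom is kept
phi = shapley_values(model, x, background=patterns)
for i, v in enumerate(phi):
    print(f"atom {i}: Shapley value {v:+.3f}")
```

In this toy setup a lower Rwp means a better fit, so a negative Shapley value marks an atom whose presence improves the fit and a positive value marks one that worsens it. Brute-force enumeration scales exponentially with the number of features, so it is only practical for a handful of atoms; for realistic structure models an approximate or model-specific SHAP estimator would be needed.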

Published in npj Computational Materials

ISSN: 2057-3960 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Materials of engineering and construction. Mechanics of materials; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://www.nature.com/npjcompumats/
