Toward Interpretable Machine Learning Models for Materials Discovery

Paulius Mikulskis; Morgan R. Alexander; David Alan Winkler

doi:10.1002/aisy.201900045

Advanced Intelligent Systems (Dec 2019)

Affiliations

Paulius Mikulskis: School of Pharmacy, University of Nottingham, Nottingham NG7 2RD, UK
Morgan R. Alexander: School of Pharmacy, University of Nottingham, Nottingham NG7 2RD, UK
David Alan Winkler: School of Pharmacy, University of Nottingham, Nottingham NG7 2RD, UK

DOI: https://doi.org/10.1002/aisy.201900045
Journal volume & issue: Vol. 1, no. 8

Abstract

Machine learning (ML) and artificial intelligence (AI) methods for modeling useful materials properties are now important technologies for the rational design and optimization of bespoke functional materials. Although these methods predict the properties of new materials well, current modeling methods characterize materials with mathematical features (descriptors) that are efficient but arcane and difficult to interpret. Data‐driven ML models are considerably more useful if they are trained on more chemically interpretable descriptors, provided that the models still accurately recapitulate the properties of the materials in the training and test sets used to generate and validate them. This work describes how a particular type of molecular fragment descriptor, the signature descriptor, achieves these joint aims of accuracy and interpretability. Seven different types of materials properties are modeled, and the performance of models generated from signature descriptors is compared with that of models generated from the widely used Dragon descriptors. The key descriptors in the models represent functionalities that make chemical sense. Mapping these fragments back onto exemplar materials provides a useful guide for chemists wishing to modify promising lead materials to improve their properties. This is one of the first applications of signature descriptors to the modeling of complex materials properties.

Published in Advanced Intelligent Systems

ISSN: 2640-4567 (Online)
Publisher: Wiley
Country of publisher: Germany
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Technology: Mechanical engineering and machinery: Control engineering systems. Automatic machinery (General)
Website: https://onlinelibrary.wiley.com/journal/26404567
