Journal of Cheminformatics (Mar 2024)

Advancing material property prediction: using physics-informed machine learning models for viscosity

  • Alex K. Chew,
  • Matthew Sender,
  • Zachary Kaplan,
  • Anand Chandrasekaran,
  • Jackson Chief Elk,
  • Andrea R. Browning,
  • H. Shaun Kwak,
  • Mathew D. Halls,
  • Mohammad Atif Faiz Afzal

DOI
https://doi.org/10.1186/s13321-024-00820-5
Journal volume & issue
Vol. 16, no. 1
pp. 1 – 14

Abstract


In materials science, accurately computing properties such as viscosity, melting point, and glass transition temperature solely through physics-based models is challenging. Data-driven machine learning (ML) faces its own challenges, especially in materials science, where data are limited. To address this, we integrate physics-informed descriptors from molecular dynamics (MD) simulations to enhance the accuracy and interpretability of ML models. Our current study focuses on accurately predicting viscosity in liquid systems using MD descriptors. In this work, we curated a comprehensive dataset of viscosities for over 4000 small organic molecules from the scientific literature and online databases. This dataset enabled us to develop quantitative structure–property relationship (QSPR) models, consisting of descriptor-based and graph neural network models, to predict temperature-dependent viscosities across a wide range of values. The QSPR models reveal that including MD descriptors improves the prediction of experimental viscosities, particularly at small dataset sizes of fewer than a thousand data points. Furthermore, feature importance tools reveal that intermolecular interactions captured by MD descriptors are most important for viscosity predictions. Finally, the QSPR models can accurately capture the inverse relationship between viscosity and temperature for six battery-relevant solvents, some of which were not included in the original dataset. Our research highlights the effectiveness of incorporating MD descriptors into QSPR models, which leads to improved accuracy for properties that are difficult to predict when using physics-based models alone or when limited data are available.
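
The inverse relationship between viscosity and temperature mentioned above can be illustrated with a minimal sketch. It assumes the empirical Andrade (Arrhenius-type) form, ln η = ln A + B/T, which is a common textbook description and not the QSPR or graph neural network models of this study. The function names, the parameter values, and the data points are hypothetical and synthetic; none of them come from the paper's dataset.

```python
import math


def andrade_viscosity(temp_k, a, b):
    """Viscosity from the Andrade form eta = A * exp(B / T), with T in kelvin."""
    return a * math.exp(b / temp_k)


def fit_andrade(temps_k, viscosities):
    """Ordinary least-squares fit of ln(eta) = ln(A) + B / T; returns (A, B)."""
    xs = [1.0 / t for t in temps_k]           # regressor: inverse temperature
    ys = [math.log(v) for v in viscosities]   # response: log viscosity
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx                 # slope of the line is B
    ln_a = y_mean - b * x_mean    # intercept of the line is ln(A)
    return math.exp(ln_a), b


# Synthetic, illustrative data generated from assumed parameters
# (hypothetical values, not measurements from the paper).
TRUE_A, TRUE_B = 0.01, 1500.0
temps = [280.0, 300.0, 320.0, 340.0, 360.0]
etas = [andrade_viscosity(t, TRUE_A, TRUE_B) for t in temps]

a_fit, b_fit = fit_andrade(temps, etas)
```

Because the synthetic points lie exactly on the assumed curve, the fit recovers the generating parameters up to floating-point error, and a positive B gives a viscosity that falls as temperature rises. Real liquids deviate from this simple form, which is part of the motivation for data-driven models such as those described in the abstract.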

Keywords