Science and Technology of Advanced Materials: Methods (Jan 2021)
Prediction of the coefficient of linear thermal expansion for the amorphous homopolymers based on chemical structure using machine learning
Abstract
The coefficient of thermal expansion (CTE) is an industrially crucial macroscopic property of polymers, yet no structure-based model expresses it with sufficient accuracy. In this work, we present two data-driven predictive models for the linear CTE of amorphous homopolymers in the glassy state, based solely on chemical structure, and show that their predictions are consistent. The first model is built with the SMILES-X software and takes as input the simplified molecular-input line-entry system (SMILES) string of the polymer's repeating unit. The second model is a random forest trained on extended-connectivity fingerprints of the repeating units. Both models are trained on 106 experimental data samples taken from the PoLyInfo database. The out-of-sample prediction shows a root-mean-square error of (2.65 ± 0.09) × 10⁻⁵ K⁻¹ ((2.58 ± 0.09) × 10⁻⁵ K⁻¹), a mean absolute error of (1.71 ± 0.06) × 10⁻⁵ K⁻¹ ((1.61 ± 0.06) × 10⁻⁵ K⁻¹), and a coefficient of determination of 0.62 ± 0.03 (0.64 ± 0.03) for SMILES-X (random forest). Additionally, the models are validated experimentally using a lab-prepared sample, with good agreement (p-value $$ \gg $$ for both models). The attention mechanism incorporated into SMILES-X highlights salient SMILES substructures, and the resulting maps suggest that the model makes decisions on a chemically interpretable basis.
Abbreviations: SMILES; CTE; CLTE; CVTE
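For reference, the three error metrics reported in the abstract (root-mean-square error, mean absolute error, and coefficient of determination) can be computed as in the minimal sketch below. It uses the standard textbook definitions and is not taken from the authors' code; the function name `regression_metrics` and the numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    rmse = math.sqrt(ss_res / n)                     # root-mean-square error
    mae = sum(abs(r) for r in residuals) / n         # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined (division by zero) if all observed values are identical.
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Hypothetical linear CTE values in units of 1e-5 K^-1.
# Illustrative numbers only; they are NOT taken from PoLyInfo or the paper.
observed = [5.0, 7.0, 9.0, 11.0]
predicted = [6.0, 7.0, 8.0, 12.0]

rmse, mae, r2 = regression_metrics(observed, predicted)
print(f"RMSE = {rmse:.3f}, MAE = {mae:.3f}, R2 = {r2:.3f}")
```

The ± intervals quoted in the abstract would come from repeating such an evaluation over several out-of-sample splits; the splitting scheme itself is not specified in this section and is therefore not sketched here.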
Keywords