npj Computational Materials (Jun 2024)

Predicting polymerization reactions via transfer learning using chemical language models

  • Brenda S. Ferrari,
  • Matteo Manica,
  • Ronaldo Giro,
  • Teodoro Laino,
  • Mathias B. Steiner

DOI
https://doi.org/10.1038/s41524-024-01304-8
Journal volume & issue
Vol. 10, no. 1
pp. 1 – 10

Abstract

Polymers are candidate materials for a wide range of sustainability applications such as carbon capture and energy storage. However, computational polymer discovery lacks automated analysis of reaction pathways and stability assessment through retrosynthesis. Here, we report an extension of transformer-based language models to polymerization for both reaction and retrosynthesis tasks. To that end, we have curated a polymerization dataset for vinyl polymers covering reactions and retrosynthesis for representative homopolymers and copolymers. Overall, we obtain a forward model Top-4 accuracy of 80% and a backward model Top-4 accuracy of 60%. We further analyze the model performance with representative polymerization examples and evaluate its prediction quality from a materials science perspective. To enable validation and reuse, we have made our models and data available in public repositories.
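
The Top-4 accuracy figures quoted in the abstract can be read as the fraction of test reactions for which the correct answer appears among the model's four highest-ranked predictions. The following is a minimal sketch of that metric, not the authors' evaluation code. The function name `top_k_accuracy`, the exact-string comparison, and the toy SMILES strings are all assumptions made for illustration; a real evaluation would likely canonicalize the SMILES strings before comparing them.

```python
from typing import Sequence


def top_k_accuracy(
    predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int = 4,
) -> float:
    """Fraction of examples whose target is among the first k ranked predictions.

    Assumption: predictions[i] is ordered from most to least likely, and a
    prediction counts as correct only on an exact string match.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(
        1 for ranked, target in zip(predictions, targets) if target in ranked[:k]
    )
    return hits / len(targets)


# Hypothetical toy data (not taken from the paper's dataset).
toy_predictions = [
    ["C=C", "CC", "C=CC", "CCC"],                # target at rank 3
    ["C=Cc1ccccc1", "CCc1ccccc1"],               # target at rank 1
    ["CC", "CCC", "CCCC", "C", "C=CC(=O)OC"],    # target at rank 5
    ["CO", "CCO"],                               # target not predicted
]
toy_targets = ["C=CC", "C=Cc1ccccc1", "C=CC(=O)OC", "C=CCl"]

top1 = top_k_accuracy(toy_predictions, toy_targets, k=1)
top4 = top_k_accuracy(toy_predictions, toy_targets, k=4)
print(f"Top-1: {top1:.2f}, Top-4: {top4:.2f}")
```

On this toy data, two of the four targets fall within the first four predictions, so the Top-4 accuracy is 0.5, while only one target is ranked first, giving a Top-1 accuracy of 0.25.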