npj Computational Materials (Jun 2024)
Predicting polymerization reactions via transfer learning using chemical language models
Abstract
Polymers are candidate materials for a wide range of sustainability applications such as carbon capture and energy storage. However, computational polymer discovery lacks automated analysis of reaction pathways and stability assessment through retrosynthesis. Here, we report an extension of transformer-based language models to polymerization for both reaction prediction and retrosynthesis tasks. To that end, we have curated a polymerization dataset for vinyl polymers covering reactions and retrosynthesis for representative homopolymers and copolymers. Overall, we obtain a Top-4 accuracy of 80% for the forward (reaction prediction) model and a Top-4 accuracy of 60% for the backward (retrosynthesis) model. We further analyze the model performance with representative polymerization examples and evaluate its prediction quality from a materials science perspective. To enable validation and reuse, we have made our models and data available in public repositories.
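To make the Top-4 figures concrete, the following minimal sketch shows how a Top-k accuracy is commonly computed for sequence models that return a ranked list of candidates per input. It is an illustration under stated assumptions, not the evaluation code used in this work: the function name `top_k_accuracy`, the exact string comparison, and the toy labels are all hypothetical, and a real evaluation would typically canonicalize the predicted and reference strings before comparing them.

```python
from typing import Sequence


def top_k_accuracy(
    predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int = 4,
) -> float:
    """Fraction of examples whose reference appears among the first k ranked predictions.

    Assumption: predictions and targets are already in a comparable
    (e.g. canonical) string form, so exact string equality is meaningful.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(
        1 for ranked, target in zip(predictions, targets) if target in ranked[:k]
    )
    return hits / len(targets)


# Toy placeholder labels (not real model outputs): each inner list is ranked best-first.
ranked_predictions = [
    ["A", "B", "C", "D", "E"],  # reference "D" is at rank 4: a Top-4 hit
    ["X", "Y", "Z", "W", "V"],  # reference "V" is at rank 5: a Top-4 miss
]
references = ["D", "V"]

top4 = top_k_accuracy(ranked_predictions, references, k=4)
```

Under this definition, a prediction counts as correct if the reference is recovered anywhere in the first k candidates, which is why Top-k accuracy can only stay the same or increase as k grows.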