Nature Communications (Dec 2023)

Language models and protocol standardization guidelines for accelerating synthesis planning in heterogeneous catalysis

  • Manu Suvarna,
  • Alain Claude Vaucher,
  • Sharon Mitchell,
  • Teodoro Laino,
  • Javier Pérez-Ramírez

DOI
https://doi.org/10.1038/s41467-023-43836-5
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 11

Abstract

Read online

Abstract Synthesis protocol exploration is paramount in catalyst discovery, yet keeping pace with rapid literature advances is increasingly time intensive. Automated synthesis protocol analysis is attractive for swiftly identifying opportunities and informing predictive models, however such applications in heterogeneous catalysis remain limited. In this proof-of-concept, we introduce a transformer model for this task, exemplified using single-atom heterogeneous catalysts (SACs), a rapidly expanding catalyst family. Our model adeptly converts SAC protocols into action sequences, and we use this output to facilitate statistical inference of their synthesis trends and applications, potentially expediting literature review and analysis. We demonstrate the model’s adaptability across distinct heterogeneous catalyst families, underscoring its versatility. Finally, our study highlights a critical issue: the lack of standardization in reporting protocols hampers machine-reading capabilities. Embracing digital advances in catalysis demands a shift in data reporting norms, and to this end, we offer guidelines for writing protocols, significantly improving machine-readability. We release our model as an open-source web application, inviting a fresh approach to accelerate heterogeneous catalysis synthesis planning.