iScience (Oct 2024)

Enhancing chemical synthesis research with NLP: Word embeddings for chemical reagent identification—A case study on nano-FeCu

  • Dingding Cao
  • Mieow Kee Chan

Journal volume & issue
Vol. 27, no. 10
p. 110780

Abstract

Summary: Nanoparticle synthesis is complex and depends on multiple variables, including reagent selection. This study builds a specialized corpus focused on “Fe, Cu, synthesis” and uses it to train a domain-specific word embedding model with unsupervised natural language processing (NLP). The model was evaluated using average cosine similarity, visual analysis via t-distributed stochastic neighbor embedding (t-SNE), synonym analysis, and analogy reasoning analysis. Results indicate a strong correlation between learning rate and cosine similarity, and the tailored model shows greater chemical specificity than general models. The framework facilitates rapid identification of potential reagents for nano-FeCu synthesis, enhancing precision in nanomaterial research. The approach offers a data-driven pathway for chemical material synthesis, with applications across disciplines.

Keywords