SoftwareX (May 2024)
MatNexus: A comprehensive text mining and analysis suite for materials discovery
Abstract
In the evolving landscape of materials science, the exponential growth in scientific publications presents both an opportunity and a challenge. Efficiently extracting valuable insights from this vast volume of literature requires specialized tools that go beyond traditional methods. Here, we introduce MatNexus, a software package designed for the automated collection, processing, and analysis of text from scientific articles in the realm of materials science. MatNexus stands out with its integrated suite of modules, which retrieve scientific articles, process textual data to uncover latent knowledge, generate vector representations (word embeddings) suitable for machine learning applications, and offer advanced visualization capabilities for these embeddings. Our tool addresses the critical need for effective and reproducible text mining in materials science, an area marked by increasing complexity and data intensity. By making the exploration of materials more efficient and insightful, as exemplified in our case study on electrocatalysts, MatNexus represents a significant advancement in the field. It offers an end-to-end solution for harnessing the wealth of information available in scientific literature, thus aiding the discovery and innovation process in materials science.