Scientific Data (Nov 2024)

A Database of Stress-Strain Properties Auto-generated from the Scientific Literature using ChemDataExtractor

  • Pankaj Kumar,
  • Saurabh Kabra,
  • Jacqueline M. Cole

DOI
https://doi.org/10.1038/s41597-024-03979-6
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 14

Abstract

Data-driven materials science in the mechanical-engineering domain has an ongoing need for information-rich databases. To address the lack of suitable property databases, this study employs the latest version of the chemistry-aware natural-language-processing (NLP) toolkit, ChemDataExtractor, to automatically curate a comprehensive materials database of key stress-strain properties. The database contains information about materials and their cognate properties: ultimate tensile strength, yield strength, fracture strength, Young’s modulus, and ductility values. A total of 720,308 data records were extracted from the scientific literature and organized into machine-readable database formats. The extracted data have an overall precision, recall, and F-score of 82.03%, 92.13%, and 86.79%, respectively. The resulting database has been made publicly available, with the aim of facilitating data-driven research and accelerating advances within the mechanical-engineering domain.
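
The three quality figures quoted in the abstract are related arithmetically. The minimal sketch below assumes that the reported F-score is the balanced F1 measure, i.e. the harmonic mean of precision and recall; the abstract does not state this explicitly. The function name `f1_score` is illustrative and is not taken from ChemDataExtractor.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (both as fractions in [0, 1]).

    Assumption: the F-score in the abstract is the balanced F1 measure.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, converted to fractions.
precision = 0.8203
recall = 0.9213

f_score_percent = round(f1_score(precision, recall) * 100, 2)
print(f"F-score: {f_score_percent}%")
```

Under this assumption, the computed value rounds to the 86.79% reported in the abstract, so the three figures are mutually consistent.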