Scientific Data (Jan 2024)

Automatically Generated Datasets: Present and Potential Self-Cleaning Coating Materials

  • Shaozhou Wang,
  • Yuwei Wan,
  • Ning Song,
  • Yixuan Liu,
  • Tong Xie,
  • Bram Hoex

DOI
https://doi.org/10.1038/s41597-024-02983-0
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 10

Abstract

Read online

Abstract The rise of urbanization coupled with pollution has highlighted the importance of outdoor self-cleaning coatings. These revolutionary coatings contribute to the longevity of various surfaces and reduce maintenance costs for a wide range of applications. Despite ongoing research to develop efficient and durable self-cleaning coatings, adopting systematic research methodologies could accelerate these advancements. In this work, we use Natural Language Processing (NLP) strategies to generate open- and traceable-sourced datasets about self-cleaning coating materials from 39,011 multi-disciplinary papers. The data are from function-based and property-based corpora for self-cleaning purposes. These datasets are presented in four different formats for diverse uses or combined uses: material frequency statistics, material dictionary, measurement value datasets for self-cleaning-related properties and optical properties, and sentiment statistics of material stability and durability. This provides a literature-based data resource for the development of self-cleaning coatings and also offers potential pathways for material discovery and prediction by machine learning.