Data in Brief (Jun 2024)

NB-TCM-CHM: Image dataset of the Chinese herbal medicine fruits and its application in classification through deep learning

  • Dingcheng Tian,
  • Cui Zhou,
  • Yu Wang,
  • Ruyi Zhang,
  • Yudong Yao

Journal volume & issue
Vol. 54
p. 110405

Abstract

Chinese herbal medicine (CHM) is integral to the traditional Chinese medicine (TCM) system. Accurately identifying Chinese herbal medicines is crucial for quality control and for verifying prescription compounding. However, because there are many Chinese herbal medicines, some of which look alike but have different therapeutic effects, precise identification is challenging. Traditional manual identification is labor-intensive and inefficient. Deep learning techniques for Chinese herbal medicine identification can enhance accuracy, improve efficiency, and lower costs. However, few high-quality Chinese herbal medicine datasets are currently available for deep learning applications. To alleviate this problem, this study constructed a dataset (Dataset 1) containing 3,384 images of 20 common Chinese herbal medicine fruits collected through web crawling. All images are annotated by TCM experts, making them suitable for training and testing Chinese herbal medicine identification methods. In addition, this study established a second dataset (Dataset 2) of 400 images taken with smartphones, which provides material for evaluating how well identification methods perform in practice. Together, the two datasets form the Ningbo Traditional Chinese Medicine Chinese Herb Medicine (NB-TCM-CHM) Dataset. In both Dataset 1 and Dataset 2, the images of each type of Chinese herbal medicine are stored in a separate folder named after that medicine. The dataset can be used to develop deep-learning-based Chinese herbal medicine identification algorithms and to evaluate the performance of such methods.
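
Because each class is stored in its own folder, the images can be indexed for training with a few lines of code. The following is a minimal sketch, not the authors' pipeline: the function name `index_image_folders`, the set of accepted file extensions, and the example root path are assumptions made for illustration, and the actual folder and file names should be checked against the downloaded dataset.

```python
from pathlib import Path

# Assumed image extensions; adjust to whatever the downloaded files actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def index_image_folders(root):
    """Index a folder-per-class image directory.

    Returns (samples, class_names), where samples is a list of
    (image_path, class_index) pairs and class_names is the sorted
    list of sub-folder names found directly under root.
    """
    root = Path(root)
    # Each immediate sub-folder is treated as one class.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_idx = {name: i for i, name in enumerate(class_names)}

    samples = []
    for name in class_names:
        # Walk the class folder and keep only files that look like images.
        for path in sorted((root / name).rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, class_to_idx[name]))
    return samples, class_names


# Hypothetical usage; "NB-TCM-CHM/Dataset 1" is an assumed local path.
# samples, class_names = index_image_folders("NB-TCM-CHM/Dataset 1")
```

A natural use is to index Dataset 1 for training and validation and to index Dataset 2 separately as a held-out set, keeping the same class-name ordering so that label indices match across both.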

Keywords