Scientific Data (Jun 2024)

A Clinical Bacterial Dataset for Deep Learning in Microbiological Rapid On-Site Evaluation

  • Xiuli Wang,
  • Yinghan Shi,
  • Shasha Guo,
  • Xuzhong Qu,
  • Fei Xie,
  • Zhimei Duan,
  • Ye Hu,
  • Han Fu,
  • Xin Shi,
  • Tingwei Quan,
  • Kaifei Wang,
  • Lixin Xie

DOI
https://doi.org/10.1038/s41597-024-03370-5
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 8

Abstract

Microbiological Rapid On-Site Evaluation (M-ROSE) is based on smear staining and microscopic observation and provides critical reference information for the diagnosis and treatment of pulmonary infectious diseases. Automatic identification of pathogens is key to improving the quality and speed of M-ROSE. Recent advances in deep learning have yielded numerous identification algorithms and datasets. However, most studies focus on artificially cultured bacteria, and clinical data and algorithms are lacking. We therefore collected Gram-stained bacterial images of lower respiratory tract specimens obtained by M-ROSE from patients with lung infections at the Chinese PLA General Hospital between 2018 and 2022, and desensitized them to produce 1,705 images (4,912 × 3,684 pixels). A total of 4,833 cocci and 6,991 bacilli were manually labelled and classified as Gram-negative or Gram-positive. In addition, we applied detection and segmentation networks for benchmark testing. The data and benchmark algorithms we provide may benefit the study of automated bacterial identification in clinical specimens.