Data (Jul 2022)

Annotations of Lung Abnormalities in the Shenzhen Chest X-ray Dataset for Computer-Aided Screening of Pulmonary Diseases

  • Feng Yang,
  • Pu Xuan Lu,
  • Min Deng,
  • Yì Xiáng J. Wáng,
  • Sivaramakrishnan Rajaraman,
  • Zhiyun Xue,
  • Les R. Folio,
  • Sameer K. Antani,
  • Stefan Jaeger

DOI
https://doi.org/10.3390/data7070095
Journal volume & issue
Vol. 7, no. 7
p. 95

Abstract

Developments in deep learning techniques have led to significant advances in automated abnormality detection in radiological images and paved the way for their potential use in computer-aided diagnosis (CAD) systems. However, the development of CAD systems for pulmonary tuberculosis (TB) diagnosis is hampered by the lack of training data that is of good visual and diagnostic quality, of sufficient size and variety, and, where relevant, accompanied by fine-region annotations. This study presents a collection of annotations/segmentations of pulmonary radiological manifestations that are consistent with TB in the publicly available and widely used Shenzhen chest X-ray (CXR) dataset, which was made available by the U.S. National Library of Medicine and obtained via a research collaboration with No. 3 People’s Hospital, Shenzhen, China. The goal of releasing these annotations is to advance the state of the art in image segmentation methods and thereby improve the fine-grained segmentation of TB-consistent findings in digital chest X-ray images. The annotation collection comprises the following: (1) annotation files in JavaScript Object Notation (JSON) format that indicate the locations and shapes of 19 lung pattern abnormalities for 336 TB patients; (2) mask files saved in PNG format for each abnormality per TB patient; and (3) a comma-separated values (CSV) file that summarizes lung abnormality types and numbers per TB patient. To the best of our knowledge, this is the first collection of pixel-level annotations of TB-consistent findings in CXRs.
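As a rough illustration of how the CSV summary in item (3) might be consumed, the following minimal sketch tallies abnormalities per patient and per type. The abstract does not specify the file layout, so the column names (`patient_id`, `abnormality_type`, `count`), the patient identifiers, and the abnormality labels below are all hypothetical placeholders rather than the released schema.

```python
import csv
import io
from collections import Counter

# Invented sample rows. The column names and values are assumptions made
# for illustration only; the released CSV may be organized differently.
SAMPLE_CSV = """patient_id,abnormality_type,count
P001,nodule,2
P001,cavity,1
P002,nodule,1
"""


def summarize(csv_text):
    """Return (per-patient totals, per-type totals) for summary CSV text."""
    per_patient = Counter()
    per_type = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        n = int(row["count"])
        per_patient[row["patient_id"]] += n
        per_type[row["abnormality_type"]] += n
    return per_patient, per_type


per_patient, per_type = summarize(SAMPLE_CSV)
```

With the real file, the hypothetical column names would need to be replaced by the actual header fields, and the text could be read from disk instead of an in-memory string.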

Keywords