Scientific Data (Jun 2023)

POLCOVID: a multicenter multiclass chest X-ray database (Poland, 2020–2021)

  • Aleksandra Suwalska,
  • Joanna Tobiasz,
  • Wojciech Prazuch,
  • Marek Socha,
  • Pawel Foszner,
  • Damian Piotrowski,
  • Katarzyna Gruszczynska,
  • Magdalena Sliwinska,
  • Jerzy Walecki,
  • Tadeusz Popiela,
  • Grzegorz Przybylski,
  • Mateusz Nowak,
  • Piotr Fiedor,
  • Malgorzata Pawlowska,
  • Robert Flisiak,
  • Krzysztof Simon,
  • Gabriela Zapolska,
  • Barbara Gizycka,
  • Edyta Szurowska,
  • for the POLCOVID Study Group,
  • Michal Marczyk,
  • Andrzej Cieszanowski,
  • Joanna Polanska

DOI
https://doi.org/10.1038/s41597-023-02229-5
Journal volume & issue
Vol. 10, no. 1
pp. 1 – 9

Abstract

The SARS-CoV-2 pandemic pushed healthcare systems worldwide to their limits, resulting in longer waiting times for diagnosis and necessary medical assistance. Because chest radiographs (CXR) are one of the most common COVID-19 diagnostic methods, many artificial intelligence tools for image-based COVID-19 detection have been developed, often trained on a small number of images from COVID-19-positive patients. This has increased the need for high-quality, well-annotated CXR image databases. This paper introduces the POLCOVID dataset, which contains CXR images of patients with COVID-19 or other types of pneumonia, and of healthy individuals, gathered from 15 Polish hospitals. The original radiographs are accompanied by preprocessed images limited to the lung area and by the corresponding lung masks obtained with a segmentation model. In addition, manually created lung masks are provided for part of the POLCOVID dataset and for four other publicly available CXR image collections. The POLCOVID dataset can support pneumonia or COVID-19 diagnosis, while the set of matched images and lung masks may serve the development of lung segmentation solutions.
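
To illustrate what "images limited to the lung area" can mean in practice, the following is a minimal sketch of applying a binary lung mask to a radiograph and cropping to the mask's bounding box. It is not the preprocessing used by the POLCOVID authors, which the abstract does not specify. The nested-list image format and the function names `mask_bounding_box` and `crop_to_lungs` are assumptions made for this example only.

```python
from typing import List, Tuple

# Hypothetical in-memory format: row-major grayscale pixels as nested lists.
Image = List[List[int]]


def mask_bounding_box(mask: Image) -> Tuple[int, int, int, int]:
    """Return (top, bottom, left, right) of the nonzero region, end-exclusive."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask contains no lung pixels")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], rows[-1] + 1, cols[0], cols[-1] + 1


def crop_to_lungs(image: Image, mask: Image) -> Image:
    """Zero out pixels outside the mask, then crop to the mask's bounding box."""
    if len(image) != len(mask) or any(len(a) != len(b) for a, b in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    top, bottom, left, right = mask_bounding_box(mask)
    return [
        [pixel if keep else 0
         for pixel, keep in zip(img_row[left:right], mask_row[left:right])]
        for img_row, mask_row in zip(image[top:bottom], mask[top:bottom])
    ]


# Toy 3x4 "radiograph" and a mask marking three lung pixels.
toy_image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
toy_mask = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 1, 0]]
cropped = crop_to_lungs(toy_image, toy_mask)
```

With real data, the image and mask would be loaded from the files in the dataset and handled with an array library; nested lists are used here only to keep the sketch dependency-free.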