PLoS ONE (Jan 2022)
A novel abnormality annotation database for COVID-19 affected frontal lung X-rays
Abstract
Consistent clinical observations of characteristic findings of COVID-19 pneumonia on chest X-rays have attracted the research community to strive to provide a fast and reliable method for screening suspected patients. Several machine learning algorithms have been proposed to find the abnormalities in the lungs using chest X-rays specific to COVID-19 pneumonia and distinguish them from other etiologies of pneumonia. However, despite the enormous magnitude of the pandemic, there are very few instances of public databases of COVID-19 pneumonia, and to the best of our knowledge, there is no database with annotation of abnormalities on the chest X-rays of COVID-19 affected patients. Annotated databases of X-rays can be of significant value in the design and development of algorithms for disease prediction. Further, explainability analysis for the performance of existing or new deep learning algorithms will be enhanced significantly with access to ground-truth abnormality annotations. The proposed COVID Abnormality Annotation for X-Rays (CAAXR) database is built upon the BIMCV-COVID19+ database which is a large-scale dataset containing COVID-19+ chest X-rays. The primary contribution of this study is the annotation of the abnormalities in over 1700 frontal chest X-rays. Further, we define protocols for semantic segmentation as well as classification for robust evaluation of algorithms. We provide benchmark results on the defined protocols using popular deep learning models such as DenseNet, ResNet, MobileNet, and VGG for classification, and UNet, SegNet, and Mask-RCNN for semantic segmentation. The classwise accuracy, sensitivity, and AUC-ROC scores are reported for the classification models, and the IoU and DICE scores are reported for the segmentation models.