Convolutional neural network model based on radiological images to support COVID-19 diagnosis: Evaluating database biases.

Caio B S Maior; João M M Santana; Isis D Lins; Márcio J C Moura

doi:10.1371/journal.pone.0247839

PLoS ONE (Jan 2021)

Journal volume & issue: Vol. 16, no. 3
p. e0247839

Abstract

As SARS-CoV-2 has spread quickly throughout the world, the scientific community has devoted major efforts to better understanding the characteristics of the virus and possible means to prevent, diagnose, and treat COVID-19. One valid approach presented in the literature is to develop an image-based method to support COVID-19 diagnosis using convolutional neural networks (CNNs). Because radiological data are still scarce owing to the novelty of COVID-19, several methodologies rely on reduced datasets, which may be inadequate and may bias the model. Here, we combined chest X-ray images from six different open databases to identify images of infected patients, differentiating COVID-19 and pneumonia from 'no-findings' images. We also discuss the performance of models built from fewer databases, whose results may be overestimated without this being apparent. Two CNN-based architectures were created to process images of different sizes (512 × 512, 768 × 768, 1024 × 1024, and 1536 × 1536). Our best model achieved a balanced accuracy (BA) of 87.7% in predicting one of the three classes ('no-findings', 'COVID-19', and 'pneumonia') and a specific balanced precision of 97.0% for the 'COVID-19' class. We also provided binary classification with a precision of 91.0% for detection of sick patients (i.e., with COVID-19 or pneumonia) and 98.4% for COVID-19 detection (i.e., differentiating it from 'no-findings' or 'pneumonia'). Indeed, although we obtained an unrealistic BA of 97.2% in one specific case, the proposed methodology of using multiple databases achieved better and less inflated results than models trained on specific image datasets. Thus, this framework is promising as a low-cost, fast, and noninvasive means to support the diagnosis of COVID-19.
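
The abstract reports balanced accuracy (BA) without defining it. As a minimal sketch, assuming the common definition of BA as the unweighted mean of per-class recall (the paper's exact formulation is not stated here), the following shows why BA can be more informative than plain accuracy when the three classes are unevenly represented. The function names and the toy labels are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
from collections import Counter

# Class names taken from the abstract; everything else here is illustrative.
CLASSES = ("no-findings", "COVID-19", "pneumonia")


def per_class_recall(y_true, y_pred, classes=CLASSES):
    """Return recall for each class that appears in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / support[c] for c in classes if support[c] > 0}


def balanced_accuracy(y_true, y_pred, classes=CLASSES):
    """Assumed definition: unweighted mean of per-class recall."""
    recalls = per_class_recall(y_true, y_pred, classes)
    return sum(recalls.values()) / len(recalls)


def plain_accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical, deliberately imbalanced toy labels (not study data).
y_true = ["no-findings"] * 8 + ["COVID-19"] * 2 + ["pneumonia"] * 2
y_pred = (
    ["no-findings"] * 8
    + ["COVID-19", "no-findings"]
    + ["pneumonia", "no-findings"]
)

if __name__ == "__main__":
    print(f"plain accuracy:    {plain_accuracy(y_true, y_pred):.3f}")
    print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.3f}")
```

In this toy case the majority class is always predicted correctly, so plain accuracy looks high, while BA drops because half of the samples in each minority class are missed.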

Published in PLoS ONE

ISSN: 1932-6203 (Online)
Publisher: Public Library of Science (PLoS)
Country of publisher: United States
LCC subjects: Medicine; Science
Website: https://journals.plos.org/plosone/
