Factors determining generalization in deep learning models for scoring COVID-CT images

Michael James Horry; Subrata Chakraborty; Biswajeet Pradhan; Maryam Fallahpoor; Hossein Chegeni; Manoranjan Paul

doi:10.3934/mbe.2021456

Mathematical Biosciences and Engineering (Oct 2021)

Affiliations

Michael James Horry: 1. Center for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and Information Technology, University of Technology Sydney, Australia
Subrata Chakraborty: 1. Center for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and Information Technology, University of Technology Sydney, Australia
Biswajeet Pradhan: 1. Center for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and Information Technology, University of Technology Sydney, Australia 2. Center of Excellence for Climate Change Research, King Abdulaziz University, Jeddah 21589, Saudi Arabia 3. Earth Observation Center, Institute of Climate Change, Universiti Kebangsaan Malaysia, Selangor 43600, Malaysia
Maryam Fallahpoor: 1. Center for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and Information Technology, University of Technology Sydney, Australia
Hossein Chegeni: 4. Fellowship of Interventional Radiology Imaging Center, IranMehr General Hospital, Iran
Manoranjan Paul: 5. Machine Vision and Digital Health (MaViDH), School of Computing, Mathematics, and Engineering, Charles Sturt University, Australia

Journal volume & issue: Vol. 18, no. 6, pp. 9264–9293

Abstract

The COVID-19 pandemic has inspired unprecedented data collection and computer vision modelling efforts worldwide, focused on the diagnosis of COVID-19 from medical images. However, these models have found limited, if any, clinical application, due in part to unproven generalization to datasets beyond their source training corpus. This study investigates the generalizability of deep learning models using publicly available COVID-19 computed tomography (CT) data through cross-dataset validation. The predictive ability of these models for COVID-19 severity is assessed using an independent dataset that is stratified for COVID-19 lung involvement. Each inter-dataset study is performed using histogram equalization, and contrast-limited adaptive histogram equalization with and without a learning Gabor filter. We show that under certain conditions, deep learning models can generalize well to an external dataset, with F1 scores up to 86%. The best-performing model shows predictive accuracy of between 75% and 96% for lung involvement scoring against an external, expertly stratified dataset. From these results we identify key factors promoting deep learning generalization: primarily the uniform acquisition of training images, and secondarily diversity in CT slice position.

Published in Mathematical Biosciences and Engineering

ISSN: 1551-0018 (Online)
Publisher: AIMS Press
Country of publisher: United States
LCC subjects: Technology: Chemical technology: Biotechnology; Science: Mathematics
Website: https://www.aimspress.com/journal/MBE
