Diagnostics (Nov 2021)

Inter-Variability Study of COVLIAS 1.0: Hybrid Deep Learning Models for COVID-19 Lung Segmentation in Computed Tomography

  • Jasjit S. Suri,
  • Sushant Agarwal,
  • Pranav Elavarthi,
  • Rajesh Pathak,
  • Vedmanvitha Ketireddy,
  • Marta Columbu,
  • Luca Saba,
  • Suneet K. Gupta,
  • Gavino Faa,
  • Inder M. Singh,
  • Monika Turk,
  • Paramjit S. Chadha,
  • Amer M. Johri,
  • Narendra N. Khanna,
  • Klaudija Viskovic,
  • Sophie Mavrogeni,
  • John R. Laird,
  • Gyan Pareek,
  • Martin Miner,
  • David W. Sobel,
  • Antonella Balestrieri,
  • Petros P. Sfikakis,
  • George Tsoulfas,
  • Athanasios Protogerou,
  • Durga Prasanna Misra,
  • Vikas Agarwal,
  • George D. Kitas,
  • Jagjit S. Teji,
  • Mustafa Al-Maini,
  • Surinder K. Dhanjil,
  • Andrew Nicolaides,
  • Aditya Sharma,
  • Vijay Rathore,
  • Mostafa Fatemi,
  • Azra Alizad,
  • Pudukode R. Krishnan,
  • Ferenc Nagy,
  • Zoltan Ruzsa,
  • Archna Gupta,
  • Subbaram Naidu,
  • Mannudeep K. Kalra

DOI
https://doi.org/10.3390/diagnostics11112025
Journal volume & issue
Vol. 11, no. 11
p. 2025

Abstract

Background: Segmentation of the lungs on computed tomography (CT) is the crucial first step in assessing COVID-19 lung severity. Current deep learning (DL)-based artificial intelligence (AI) models are biased at the training stage of segmentation because only one set of ground truth (GT) annotations is evaluated. We propose a robust and stable inter-variability analysis of CT lung segmentation in COVID-19 to avoid the effect of this bias. Methodology: The proposed inter-variability study uses two GT tracers for lung segmentation on chest CT. Three AI models, PSP Net, VGG-SegNet, and ResNet-SegNet, were trained using the GT annotations. We hypothesized that if AI models are trained on GT tracings from multiple experience levels, and if the performance of these AI models on the test data is within a 5% range of one another, such an AI model can be considered robust and unbiased. The K5 protocol (training to testing: 80%:20%) was adopted. Ten metrics were used for performance evaluation. Results: The database consisted of 5000 CT chest images from 72 COVID-19-infected patients. By computing the correlation coefficients (CC) for the outputs of the two AI models trained on the annotations of the two GT tracers, taking the difference between those CCs, and repeating the process for all three AI models, we show the differences to be 0%, 0.51%, and 2.04% (all < 5%) [...] PSP Net > VGG-SegNet. Conclusions: The AI models were clinically robust and stable during the inter-variability analysis of CT lung segmentation in COVID-19 patients.
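The 5% criterion in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names are hypothetical, the CC is assumed to be a Pearson correlation over paired measurements (for example, per-image lung areas), and the difference is assumed to be expressed relative to the larger of the two CCs; the abstract does not specify either choice.

```python
import math


def pearson_cc(x, y):
    """Pearson correlation coefficient between two paired sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def cc_percent_difference(cc_a, cc_b):
    """Absolute difference between two CCs, in percent of the larger CC.

    Assumption: the abstract does not state how the percentage is normalized.
    """
    return abs(cc_a - cc_b) / max(abs(cc_a), abs(cc_b)) * 100.0


def within_hypothesized_range(cc_a, cc_b, threshold_pct=5.0):
    """True if the two CCs differ by less than the threshold (5% by default)."""
    return cc_percent_difference(cc_a, cc_b) < threshold_pct


# Hypothetical CC values for one model trained on tracer 1 and on tracer 2.
cc_tracer_1 = 0.98
cc_tracer_2 = 0.96
print(cc_percent_difference(cc_tracer_1, cc_tracer_2))
print(within_hypothesized_range(cc_tracer_1, cc_tracer_2))
```

Under these assumptions, the check would be repeated once per model (PSP Net, VGG-SegNet, ResNet-SegNet), each time comparing the CC obtained with one tracer's annotations against the CC obtained with the other's.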

Keywords