Noise-induced modality-specific pretext learning for pediatric chest X-ray image classification

Sivaramakrishnan Rajaraman; Zhaohui Liang; Zhiyun Xue; Sameer Antani

doi:10.3389/frai.2024.1419638

Frontiers in Artificial Intelligence (Sep 2024)



DOI: https://doi.org/10.3389/frai.2024.1419638
Journal volume & issue: Vol. 7

Abstract


Introduction: Deep learning (DL) has significantly advanced medical image classification. However, it often relies on transfer learning (TL) from models pretrained on large, generic non-medical image datasets like ImageNet. Medical images, by contrast, possess unique visual characteristics that such general models may not adequately capture.

Methods: This study examines the effectiveness of modality-specific pretext learning strengthened by image denoising and deblurring in enhancing the classification of pediatric chest X-ray (CXR) images into those exhibiting no findings, i.e., normal lungs, or with cardiopulmonary disease manifestations. Specifically, we use a VGG-16-Sharp-U-Net architecture and leverage its encoder in conjunction with a classification head to distinguish normal from abnormal pediatric CXR findings. We benchmark this performance against the traditional TL approach, viz., the VGG-16 model pretrained only on ImageNet. Measures used for performance evaluation are balanced accuracy, sensitivity, specificity, F-score, Matthews correlation coefficient (MCC), Kappa statistic, and Youden's index.

Results: Our findings reveal that models developed from CXR modality-specific pretext encoders substantially outperform the ImageNet-only pretrained model, viz., Baseline, and achieve significantly higher sensitivity (p < 0.05) with marked improvements in balanced accuracy, F-score, MCC, Kappa statistic, and Youden's index. A novel attention-based fuzzy ensemble of the pretext-learned models further improves performance across these metrics (Balanced accuracy: 0.6376; Sensitivity: 0.4991; F-score: 0.5102; MCC: 0.2783; Kappa: 0.2782; and Youden's index: 0.2751), compared to Baseline (Balanced accuracy: 0.5654; Sensitivity: 0.1983; F-score: 0.2977; MCC: 0.1998; Kappa: 0.1599; and Youden's index: 0.1327).

Discussion: The superior results of the CXR modality-specific pretext-learned models and their ensemble underscore the potential of this approach as a viable alternative to conventional ImageNet pretraining for medical image classification. Results from this study promote further exploration of medical modality-specific TL techniques in the development of DL models for various medical imaging applications.

Published in Frontiers in Artificial Intelligence

ISSN: 2624-8212 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/artificial-intelligence#
