IEEE Access (Jan 2024)

Multimodal Deep Convolutional Neural Network Pipeline for AI-Assisted Early Detection of Oral Cancer

  • G. A. I. Devindi,
  • D. M. D. R. Dissanayake,
  • S. N. Liyanage,
  • F. B. A. H. Francis,
  • M. B. D. Pavithya,
  • N. S. Piyarathne,
  • P. V. K. S. Hettiarachchi,
  • R. M. S. G. K. Rasnayaka,
  • R. D. Jayasinghe,
  • R. G. Ragel,
  • I. Nawinne

DOI
https://doi.org/10.1109/ACCESS.2024.3454338
Journal volume & issue
Vol. 12
pp. 124375 – 124390

Abstract

Oral Squamous Cell Carcinoma (OSCC) poses a significant health challenge, and early detection is crucial for effective treatment and improved survival rates. While previous studies have examined standard photographs, such as those taken with smartphones, for oral lesion classification, they typically rely on images alone, overlooking the potential benefits of incorporating multiple modalities. This study addresses this gap by proposing a multimodal deep-learning pipeline that incorporates diverse data sources, including patient metadata, mirroring the diagnostic approach of clinicians in the early detection of oral cancer. The pipeline leverages state-of-the-art image encoders to classify oral lesions into benign and potentially malignant categories. A performance comparison of six pre-trained deep-learning models (MobileNetV3-Large, MixNet-S, ResNet-50, HRNet-W18-C, DenseNet-121, and Inception_v3) is presented. Using the MobileNetV3-Large image encoder, the proposed pipeline achieved an overall accuracy of 81%, precision of 79%, recall of 79%, F1-score of 78%, and a Matthews Correlation Coefficient (MCC) of 0.57. The findings highlight the efficacy of integrating multiple data modalities for more accurate early detection of potential malignancies compared with using image data alone. These outcomes could pave the way for improved clinical decision-making and patient outcomes.
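The Matthews Correlation Coefficient reported above is less common than accuracy or F1-score; for readers unfamiliar with it, the sketch below computes MCC from a binary confusion matrix. This is a generic illustration of the metric's standard formula, not the authors' code, and the counts used are purely hypothetical:

```python
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (random) to +1 (perfect).
    Returns 0.0 when any marginal sum is zero, by convention.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0

# Hypothetical counts, chosen only to show the calculation
print(round(mcc(tp=40, tn=41, fp=10, fn=9), 2))
```

Unlike accuracy, MCC accounts for all four cells of the confusion matrix, which is why it is often preferred when the benign/potentially-malignant classes are imbalanced.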

Keywords