The Lancet Regional Health. Americas (Jul 2025)

Automated classification of oral potentially malignant disorders and oral squamous cell carcinoma using a convolutional neural network framework: a cross-sectional study

  • Cristina Saldivia-Siracusa,
  • Eduardo Santos Carlos de Souza,
  • Arnaldo Vitor Barros da Silva,
  • Anna Luíza Damaceno Araújo,
  • Caíque Mariano Pedroso,
  • Tarcília Aparecida da Silva,
  • Maria Sissa Pereira Sant'Ana,
  • Felipe Paiva Fonseca,
  • Hélder Antônio Rebelo Pontes,
  • Marcos G. Quiles,
  • Marcio Ajudarte Lopes,
  • Pablo Agustin Vargas,
  • Syed Ali Khurram,
  • Alexander T. Pearson,
  • Mark W. Lingen,
  • Luiz Paulo Kowalski,
  • Keith D. Hunter,
  • André Carlos Ponce de Leon Ferreira de Carvalho,
  • Alan Roger Santos-Silva

DOI
https://doi.org/10.1016/j.lana.2025.101138
Journal volume & issue
Vol. 47
p. 101138

Abstract

Background: Artificial Intelligence (AI) models hold promise as useful tools in healthcare practice. We aimed to develop and assess AI models for the automatic classification of clinical images of oral potentially malignant disorders (OPMD) and oral squamous cell carcinoma (OSCC) through a Deep Learning (DL) approach, and to explore explainability using Gradient-weighted Class Activation Mapping (Grad-CAM).

Methods: This study assessed a dataset of 778 clinical images of OPMD and OSCC, divided into training, model-optimization, and internal-testing subsets in an 8:1:1 proportion. Transfer learning strategies were applied to eight pretrained convolutional neural networks (CNNs). Performance was evaluated by mean accuracy, precision, recall, specificity, F1-score, and area under the receiver operating characteristic curve (AUROC). A qualitative Grad-CAM appraisal was performed to assess explainability.

Findings: The ConvNeXt and MobileNet CNNs showed the best performance. Transfer learning strategies enhanced performance for both algorithms, and the best model achieved mean accuracy, precision, recall, F1-score, and AUROC of 0.799, 0.837, 0.756, 0.794, and 0.863, respectively, during internal testing. MobileNet displayed the lowest computational cost. Grad-CAM analysis revealed discrepancies between the best-performing model and the model with the highest explainability.

Interpretation: The ConvNeXt and MobileNet DL models accurately distinguished OSCC from OPMD in clinical photographs taken with different types of image-capture devices. Grad-CAM proved to be an outstanding tool for interpreting model performance. These results suggest that adopting DL models in healthcare could aid diagnostic assistance and decision-making in clinical practice.

Funding: This work was supported by FAPESP (2022/13069-8, 2022/07276-0, 2021/14585-7 and 2024/20694-1), CAPES, CNPq (307604/2023-3) and FAPEMIG.
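The evaluation metrics reported in the abstract (accuracy, precision, recall, specificity, F1-score) are all derived from a binary confusion matrix, here taking OSCC as the positive class and OPMD as the negative class. A minimal sketch of those definitions follows; the function name and the example counts are illustrative assumptions, not taken from the study's code or data.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn = true positives, false positives, false negatives,
    true negatives (positive class = OSCC, negative class = OPMD).
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)            # positive predictive value
    recall = tp / (tp + fn)               # sensitivity
    specificity = tn / (tn + fp)          # true negative rate
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }

# Illustrative counts only (hypothetical, not the study's confusion matrix):
m = binary_metrics(tp=60, fp=12, fn=20, tn=64)
```

In a multi-class or repeated-run setting, the "mean" values reported in the study would be averages of these per-class or per-run figures.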

Keywords