IEEE Access (Jan 2025)

Real-Time Multi-Task Deep Learning Model for Polyp Detection, Characterization, and Size Estimation

  • Phanukorn Sunthornwetchapong,
  • Kasichon Hombubpha,
  • Kasenee Tiankanon,
  • Satimai Aniwan,
  • Pasit Jakkrawankul,
  • Natawut Nupairoj,
  • Peerapon Vateekul,
  • Rungsun Rerknimitr

DOI
https://doi.org/10.1109/ACCESS.2025.3527720
Journal volume & issue
Vol. 13
pp. 8469 – 8481

Abstract

While performing a colonoscopy, an endoscopist must carry out several tasks: finding polyps, classifying them, and deciding on the next step, in particular whether to resect them. These tasks are challenging for fellow doctors. All three are prone to error, and the error rate varies among endoscopists. A proven way to improve performance is a computer-aided detection and diagnosis system, which is typically a real-time system. In this work, we present a modified convolutional neural network (CNN)-based deep learning (DL) model that performs these tasks in real time, building on two existing object detection models: YOLOv5 and YOLOv8. Because the datasets for the individual tasks have incomplete labels, we compare several training strategies. Our YOLOv8-based model achieved an F1-score of 95.96% for polyp detection, an F1-score of 85.24% for polyp classification, and a macro F1-score of 78.41% for polyp size estimation. Compared with fellow doctors’ findings, these results were superior in both accuracy and macro F1-score while maintaining real-time inference speed.
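
For readers unfamiliar with the metric reported for size estimation, the following is a minimal sketch of how a macro F1-score is commonly computed: the F1-score of each class is calculated one-vs-rest, and the per-class scores are averaged with equal weight. The size labels ("small", "medium", "large") and the sample predictions are hypothetical and are not taken from the paper, and the abstract does not state how the authors computed their scores.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} computed one-vs-rest for every label that appears."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # class p was predicted but is wrong
            fn[t] += 1      # class t was missed
    scores = {}
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical size labels for five polyps (illustration only).
y_true = ["small", "small", "small", "medium", "large"]
y_pred = ["small", "small", "medium", "medium", "large"]

print(per_class_f1(y_true, y_pred))
print(round(macro_f1(y_true, y_pred), 4))
```

Because every class carries equal weight, a macro F1-score penalizes poor performance on under-represented classes more than plain accuracy does, which is presumably why it is reported alongside accuracy here.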

Keywords