Shanghai Jiaotong Daxue xuebao. Yixue ban (Jun 2024)

Comparative study on methods for colon polyp endoscopic image segmentation and classification based on deep learning

  • CHEN Jian,
  • WANG Zhenni,
  • XIA Kaijian,
  • WANG Ganhong,
  • LIU Luojie,
  • XU Xiaodan

DOI
https://doi.org/10.3969/j.issn.1674-8115.2024.06.012
Journal volume & issue
Vol. 44, no. 6
pp. 762 – 772

Abstract

Objective · To compare the performance of various deep learning methods in the segmentation and classification of colorectal polyp endoscopic images, and to identify the most effective approach.

Methods · Four colorectal polyp datasets were collected from three hospitals, comprising 1 534 static images and 15 videos. All samples were pathologically confirmed and categorized into two types: serrated lesions and adenomatous polyps. Polygonal annotations were made with the LabelMe tool, and the annotation results were converted into integer mask format. These data were used to train deep neural networks of several architectures, including convolutional neural networks (CNN), Transformers, and their fusion, with the aim of developing an effective semantic segmentation model. Multiple performance indicators for the automatic diagnosis of colon polyps were compared across the architectures, including mIoU, aAcc, mAcc, mDice, mFscore, mPrecision, and mRecall.

Results · Four semantic segmentation models with different architectures were developed: two deep CNN architectures (Fast-SCNN and DeepLabV3plus), one Transformer architecture (Segformer), and one hybrid architecture (KNet). In a comprehensive evaluation on 291 test images, KNet achieved the highest mIoU of 84.59%, significantly surpassing Fast-SCNN (75.32%), DeepLabV3plus (78.63%), and Segformer (80.17%). Across the categories "background", "serrated lesions", and "adenomatous polyps", KNet's intersection over union (IoU) values were 98.91%, 74.12%, and 80.73%, respectively, all higher than those of the other models. KNet also led on the remaining key metrics, with aAcc, mAcc, mDice, mFscore, and mRecall reaching 98.59%, 91.24%, 91.31%, 91.31%, and 91.24%, respectively. Although its mPrecision of 91.46% was not the highest, KNet's overall performance remained the best. In inference testing on 80 external test images, KNet maintained an mIoU of 81.53%, demonstrating strong generalization capability.

Conclusion · The semantic segmentation model for colorectal polyp endoscopic images built on the KNet hybrid architecture exhibits superior predictive performance and shows potential as an efficient tool for detecting colorectal polyps.
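The Methods mention converting LabelMe polygonal annotations into integer mask format before training. The paper itself does not include code; the following is a minimal sketch of such a conversion, assuming LabelMe's standard JSON layout and hypothetical label strings for the two lesion classes.

```python
import json

import numpy as np
from PIL import Image, ImageDraw

# Hypothetical label-to-index mapping: the paper uses three classes
# (background, serrated lesion, adenomatous polyp); the exact label
# strings used by the annotators are an assumption here.
LABEL_TO_ID = {"serrated_lesion": 1, "adenomatous_polyp": 2}  # background = 0


def labelme_json_to_mask(json_path: str) -> np.ndarray:
    """Rasterize LabelMe polygon annotations into an integer class mask."""
    with open(json_path, "r", encoding="utf-8") as f:
        ann = json.load(f)

    h, w = ann["imageHeight"], ann["imageWidth"]
    mask = Image.new("I", (w, h), 0)           # start with all background (0)
    draw = ImageDraw.Draw(mask)

    for shape in ann.get("shapes", []):
        class_id = LABEL_TO_ID.get(shape["label"])
        if class_id is None or shape.get("shape_type") != "polygon":
            continue
        polygon = [tuple(pt) for pt in shape["points"]]
        draw.polygon(polygon, fill=class_id)   # fill polygon interior with the class id

    return np.asarray(mask, dtype=np.uint8)
```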
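The reported metrics (per-class IoU, mIoU, mDice, and so on) follow their standard definitions. As a reference for how such numbers are typically computed from predicted and ground-truth integer masks (this is not the authors' evaluation code), a short sketch is:

```python
import numpy as np


def per_class_iou_dice(pred: np.ndarray, gt: np.ndarray, num_classes: int = 3):
    """Per-class IoU and Dice for integer masks of identical shape."""
    ious, dices = [], []
    for c in range(num_classes):
        pred_c, gt_c = (pred == c), (gt == c)
        inter = np.logical_and(pred_c, gt_c).sum()
        union = np.logical_or(pred_c, gt_c).sum()
        total = pred_c.sum() + gt_c.sum()
        ious.append(inter / union if union else float("nan"))
        dices.append(2 * inter / total if total else float("nan"))
    return ious, dices


# mIoU and mDice are the means over the three classes (background,
# serrated lesion, adenomatous polyp), usually aggregated over the test set:
#   miou  = np.nanmean(ious)
#   mdice = np.nanmean(dices)
```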

Keywords