BMC Gastroenterology (Aug 2024)

AI support for colonoscopy quality control using CNN and transformer architectures

  • Jian Chen,
  • Ganhong Wang,
  • Jingjie Zhou,
  • Zihao Zhang,
  • Yu Ding,
  • Kaijian Xia,
  • Xiaodan Xu

DOI
https://doi.org/10.1186/s12876-024-03354-0
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 13

Abstract

Background: This study aimed to construct deep learning models for colonoscopy quality control using different architectures and to explore their decision-making mechanisms.

Methods: A total of 4,189 colonoscopy images were collected from two medical centers, covering different levels of bowel cleanliness, the presence of polyps, and the cecum. Using these data, eight pre-trained models based on CNN and Transformer architectures were adapted through transfer learning and fine-tuning. The models’ performance was evaluated using metrics such as AUC, precision, and F1 score. Perceptual hash functions were employed to detect changes between images, enabling real-time monitoring of colonoscopy withdrawal speed. Model interpretability was analyzed using techniques such as Grad-CAM and SHAP. Finally, the best-performing model was converted to ONNX format and deployed on device terminals.

Results: On the validation set, the EfficientNetB2 model outperformed the other CNN- and Transformer-based models, achieving an accuracy of 0.992, with precision, recall, and F1 score of 0.991, 0.989, and 0.990, respectively. On the test set, the EfficientNetB2 model achieved an average AUC of 0.996, with a precision of 0.948 and a recall of 0.952. Interpretability analysis showed the specific image regions the model used for decision-making. After conversion to ONNX format and deployment on device terminals, the model achieved an average inference speed of over 60 frames per second.

Conclusions: The AI-assisted quality system, based on the EfficientNetB2 model, integrates four key quality control indicators for colonoscopy. This integration enables medical institutions to comprehensively manage and improve these indicators using a single model, showing promising potential for clinical application.
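
The abstract does not say which perceptual hash the authors used or how hash differences were mapped to withdrawal speed. As a rough illustration of the general idea only, the sketch below assumes an average hash (aHash) over an 8 x 8 grid and a Hamming-distance comparison between consecutive grayscale frames. The function names (`average_hash`, `hamming_distance`, `frame_changed`) and the threshold of 10 bits are hypothetical choices, not values from the paper.

```python
def average_hash(gray, hash_size=8):
    """Return an integer average hash (aHash) for a 2D list of grayscale pixels.

    Assumption: aHash is used here for illustration; the paper does not
    specify its perceptual hash variant.
    """
    height, width = len(gray), len(gray[0])
    if height < hash_size or width < hash_size:
        raise ValueError("image is smaller than the hash grid")

    # Downsample to hash_size x hash_size by averaging pixel blocks.
    cells = []
    for r in range(hash_size):
        r0, r1 = r * height // hash_size, (r + 1) * height // hash_size
        for c in range(hash_size):
            c0, c1 = c * width // hash_size, (c + 1) * width // hash_size
            block = [gray[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))

    # One bit per cell: 1 if the cell is brighter than the overall mean.
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two integer hashes."""
    return bin(hash_a ^ hash_b).count("1")


def frame_changed(prev_gray, curr_gray, threshold=10):
    """Flag a scene change when the hashes differ by more than `threshold` bits.

    The threshold is a hypothetical value and would need tuning on real video.
    """
    distance = hamming_distance(average_hash(prev_gray), average_hash(curr_gray))
    return distance > threshold
```

In a monitoring loop, a run of consecutive frames flagged as changed could be read as rapid scope movement, while long runs of unchanged frames would suggest a slow or paused withdrawal. How such counts translate into a speed estimate is left open here, since the abstract gives no details.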

Keywords