Frontiers in Oncology (Oct 2022)

Automatic detection of early gastric cancer in endoscopy based on Mask region-based convolutional neural networks (Mask R-CNN)(with video)

  • Jing Jin,
  • Qianqian Zhang,
  • Bill Dong,
  • Tao Ma,
  • Xuecan Mei,
  • Xi Wang,
  • Shaofang Song,
  • Jie Peng,
  • Aijiu Wu,
  • Lanfang Dong,
  • Derun Kong

DOI
https://doi.org/10.3389/fonc.2022.927868
Journal volume & issue
Vol. 12

Abstract

Read online

The artificial intelligence (AI)-assisted endoscopic detection of early gastric cancer (EGC) has been preliminarily developed. The currently used algorithms still exhibit limitations of large calculation and low-precision expression. The present study aimed to develop an endoscopic automatic detection system in EGC based on a mask region-based convolutional neural network (Mask R-CNN) and to evaluate the performance in controlled trials. For this purpose, a total of 4,471 white light images (WLIs) and 2,662 narrow band images (NBIs) of EGC were obtained for training and testing. In total, 10 of the WLIs (videos) were obtained prospectively to examine the performance of the RCNN system. Furthermore, 400 WLIs were randomly selected for comparison between the Mask R-CNN system and doctors. The evaluation criteria included accuracy, sensitivity, specificity, positive predictive value and negative predictive value. The results revealed that there were no significant differences between the pathological diagnosis with the Mask R-CNN system in the WLI test (χ2 = 0.189, P=0.664; accuracy, 90.25%; sensitivity, 91.06%; specificity, 89.01%) and in the NBI test (χ2 = 0.063, P=0.802; accuracy, 95.12%; sensitivity, 97.59%). Among 10 WLI real-time videos, the speed of the test videos was up to 35 frames/sec, with an accuracy of 90.27%. In a controlled experiment of 400 WLIs, the sensitivity of the Mask R-CNN system was significantly higher than that of experts (χ2 = 7.059, P=0.000; 93.00% VS 80.20%), and the specificity was higher than that of the juniors (χ2 = 9.955, P=0.000, 82.67% VS 71.87%), and the overall accuracy rate was higher than that of the seniors (χ2 = 7.009, P=0.000, 85.25% VS 78.00%). On the whole, the present study demonstrates that the Mask R-CNN system exhibited an excellent performance status for the detection of EGC, particularly for the real-time analysis of WLIs. It may thus be effectively applied to clinical settings.

Keywords