Sensors (Jul 2020)

Improving Temporal Stability and Accuracy for Endoscopic Video Tissue Classification Using Recurrent Neural Networks

  • Tim Boers,
  • Joost van der Putten,
  • Maarten Struyvenberg,
  • Kiki Fockens,
  • Jelmer Jukema,
  • Erik Schoon,
  • Fons van der Sommen,
  • Jacques Bergman,
  • Peter de With

DOI
https://doi.org/10.3390/s20154133
Journal volume & issue
Vol. 20, no. 15
p. 4133

Abstract


Early Barrett’s neoplasia is often missed due to its subtle visual features and the inexperience of non-expert endoscopists with such lesions. While promising results have been reported on the automated detection of this type of early cancer in still endoscopic images, video-based detection exploiting the temporal domain remains an open problem. The temporally stable nature of video data in endoscopic examinations enables the development of a framework that diagnoses the imaged tissue class over time, thereby yielding a more robust and improved model than purely spatial predictions. We show that the introduction of Recurrent Neural Network nodes offers a more stable and accurate model for tissue classification, compared with classification of individual images. We have developed a customized ResNet18 feature extractor with four types of classifiers: Fully Connected (FC), Fully Connected with an averaging filter (FC Avg (n = 5)), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). Experimental results are based on 82 pullback videos of the esophagus, involving 46 patients with high-grade dysplasia. Our results demonstrate that the LSTM classifier outperforms the FC, FC Avg (n = 5), and GRU classifiers, with an average accuracy of 85.9% compared with 82.2%, 83.0%, and 85.6%, respectively. The benefit of our novel implementation for endoscopic tissue classification is the inclusion of spatio-temporal information for improved and robust decision making, and it is a first step towards full temporal learning of esophageal cancer detection in endoscopic video.
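To illustrate the architecture described in the abstract, the following is a minimal sketch (not the authors' released code) of a per-frame ResNet18 feature extractor followed by an LSTM classifier head operating on a short clip from a pullback video. The class name, hidden size, sequence length, and number of tissue classes are assumptions made purely for illustration.

```python
# Sketch of a ResNet18 backbone + LSTM head for temporal tissue classification.
# Hyperparameters below (hidden_size, num_classes, clip length) are assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet18


class TemporalTissueClassifier(nn.Module):
    def __init__(self, num_classes: int = 2, hidden_size: int = 256):
        super().__init__()
        backbone = resnet18(weights=None)      # per-frame feature extractor
        feat_dim = backbone.fc.in_features     # 512 for ResNet18
        backbone.fc = nn.Identity()            # drop the ImageNet classification head
        self.backbone = backbone
        self.lstm = nn.LSTM(feat_dim, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, 3, H, W) -- a short clip of consecutive frames
        b, t, c, h, w = frames.shape
        feats = self.backbone(frames.reshape(b * t, c, h, w)).reshape(b, t, -1)
        out, _ = self.lstm(feats)              # temporal integration of frame features
        return self.head(out)                  # per-frame class logits


# Example: classify every frame of a 16-frame clip
if __name__ == "__main__":
    model = TemporalTissueClassifier()
    clip = torch.randn(1, 16, 3, 224, 224)
    logits = model(clip)                       # shape: (1, 16, num_classes)
    print(logits.shape)
```

Swapping `nn.LSTM` for `nn.GRU` (or replacing the recurrent layer with a plain linear head, optionally followed by temporal averaging over n = 5 predictions) would correspond to the other classifier variants compared in the paper.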

Keywords