Performance of Convolutional Neural Networks for Polyp Localization on Public Colonoscopy Image Datasets

Alba Nogueira-Rodríguez; Miguel Reboiro-Jato; Daniel Glez-Peña; Hugo López-Fernández

doi:10.3390/diagnostics12040898

Diagnostics (Apr 2022)

Affiliations

Alba Nogueira-Rodríguez: CINBIO, Department of Computer Science, ESEI-Escuela Superior de Ingeniería Informática, Universidade de Vigo, 32004 Ourense, Spain
Miguel Reboiro-Jato: CINBIO, Department of Computer Science, ESEI-Escuela Superior de Ingeniería Informática, Universidade de Vigo, 32004 Ourense, Spain
Daniel Glez-Peña: CINBIO, Department of Computer Science, ESEI-Escuela Superior de Ingeniería Informática, Universidade de Vigo, 32004 Ourense, Spain
Hugo López-Fernández: CINBIO, Department of Computer Science, ESEI-Escuela Superior de Ingeniería Informática, Universidade de Vigo, 32004 Ourense, Spain

DOI: https://doi.org/10.3390/diagnostics12040898
Journal volume & issue: Vol. 12, no. 4, p. 898

Abstract

Colorectal cancer is one of the most frequent malignancies. Colonoscopy is the de facto standard for detecting precancerous lesions in the colon, i.e., polyps, during screening studies or following a physician's recommendation. In recent years, artificial intelligence, and especially deep learning techniques such as convolutional neural networks, has been applied to polyp detection and localization in order to develop real-time computer-aided detection (CADe) systems. However, the performance of machine learning models is very sensitive to changes in the nature of the testing instances, especially when trying to reproduce results on datasets entirely different from those used for model development, i.e., inter-dataset testing. Here, we report the results of testing our previously published polyp detection model on ten public colonoscopy image datasets and analyze them in the context of the results of 20 other state-of-the-art publications that use the same datasets. The F1-score of our recently published model was 0.88 when evaluated on a private test partition, i.e., intra-dataset testing, but it decayed, on average, by 13.65% when tested on the ten public datasets. In the published research, the average intra-dataset F1-score is 0.91, and we observed that it also decays in the inter-dataset setting, to an average F1-score of 0.83.
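
For readers unfamiliar with the metric, the following is a minimal sketch of how an F1-score and a decay between intra- and inter-dataset results could be computed. It is not the authors' evaluation code. The function names and the detection counts are hypothetical, and the abstract does not state whether decay is measured relative to the intra-dataset score; the sketch assumes that it is. Only the values 0.91 and 0.83 come from the abstract.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: harmonic mean of precision and recall.

    Equivalent closed form: 2*TP / (2*TP + FP + FN).
    """
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def relative_decay(intra_f1: float, inter_f1: float) -> float:
    """Percentage drop of the inter-dataset F1 relative to the intra-dataset F1.

    Assumption: decay is relative to the intra-dataset score; the abstract
    does not spell out the exact definition used in the paper.
    """
    return (intra_f1 - inter_f1) / intra_f1 * 100.0


# Hypothetical detection counts, chosen only to illustrate the formula.
example_f1 = f1_score(tp=88, fp=12, fn=12)

# Averages reported in the abstract for the published research:
# intra-dataset F1 of 0.91 and inter-dataset F1 of 0.83.
published_decay = relative_decay(0.91, 0.83)

print(f"example F1: {example_f1:.2f}")
print(f"relative decay in published research: {published_decay:.2f}%")
```

Under this assumed definition, the averages reported for the published research correspond to a relative drop of a little under 9%.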

Published in Diagnostics

ISSN: 2075-4418 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Medicine: Medicine (General)
Website: http://www.mdpi.com/journal/diagnostics
