Applied Sciences (Nov 2020)
PICCOLO White-Light and Narrow-Band Imaging Colonoscopic Dataset: A Performance Comparative of Models and Datasets
Abstract
Colorectal cancer is one of the world leading death causes. Fortunately, an early diagnosis allows for effective treatment, increasing the survival rate. Deep learning techniques have shown their utility for increasing the adenoma detection rate at colonoscopy, but a dataset is usually required so the model can automatically learn features that characterize the polyps. In this work, we present the PICCOLO dataset, that comprises 3433 manually annotated images (2131 white-light images 1302 narrow-band images), originated from 76 lesions from 40 patients, which are distributed into training (2203), validation (897) and test (333) sets assuring patient independence between sets. Furthermore, clinical metadata are also provided for each lesion. Four different models, obtained by combining two backbones and two encoder–decoder architectures, are trained with the PICCOLO dataset and other two publicly available datasets for comparison. Results are provided for the test set of each dataset. Models trained with the PICCOLO dataset have a better generalization capacity, as they perform more uniformly along test sets of all datasets, rather than obtaining the best results for its own test set. This dataset is available at the website of the Basque Biobank, so it is expected that it will contribute to the further development of deep learning methods for polyp detection, localisation and classification, which would eventually result in a better and earlier diagnosis of colorectal cancer, hence improving patient outcomes.
Keywords