Endoscopy International Open (Feb 2020)

CAD-CAP: a 25,000-image database serving the development of artificial intelligence for capsule endoscopy

  • Romain Leenhardt
  • Cynthia Li
  • Jean-Philippe Le Mouel
  • Gabriel Rahmi
  • Jean Christophe Saurin
  • Franck Cholet
  • Arnaud Boureille
  • Xavier Amiot
  • Michel Delvaux
  • Clotilde Duburque
  • Chloé Leandri
  • Romain Gérard
  • Stéphane Lecleire
  • Farida Mesli
  • Isabelle Nion-Larmurier
  • Olivier Romain
  • Sylvie Sacher-Huvelin
  • Camille Simon-Shane
  • Geoffroy Vanbiervliet
  • Philippe Marteau
  • Aymeric Histace
  • Xavier Dray

DOI
https://doi.org/10.1055/a-1035-9088
Journal volume & issue
Vol. 08, no. 03
pp. E415 – E420

Abstract

Background and study aims: Capsule endoscopy (CE) is the preferred method for small bowel (SB) exploration. With a mean of 50,000 SB frames per video, SB-CE reading is time-consuming and tedious (30 to 60 minutes per video). We describe a large, multicenter database named CAD-CAP (Computer-Assisted Diagnosis for CAPsule Endoscopy), which aims to serve the development of CAD tools for CE reading.

Materials and methods: Twelve French endoscopy centers were involved. All available third-generation SB-CE videos (Pillcam, Medtronic) were retrospectively selected from these centers and deidentified. All pathological frames were extracted and included in the database. Findings within these frames were manually segmented by two pre-med students trained and supervised by an expert reader. All frames were then classified by type and clinical relevance by a panel of three expert readers. An automated extraction process was also developed to create a dataset of normal, proofread control images from normal, complete SB-CE videos.

Results: A total of 4,174 SB-CE videos were included. Of these, 1,480 videos (35%) containing at least one pathological finding were selected. Findings from 5,184 frames (with their short video sequences) were extracted and delimited: 718 frames with fresh blood, 3,097 frames with vascular lesions, and 1,369 frames with inflammatory and ulcerative lesions. In addition, 20,000 normal frames were extracted from 206 normal SB-CE videos. CAD-CAP has already been used to develop automated tools for angiectasia detection and in two international challenges on computerized medical image analysis.
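The counts reported in the Results can be cross-checked with a short tally. The following is a minimal sketch in Python. The variable names are illustrative assumptions and are not part of the CAD-CAP database or any tooling described by the authors. Only the figures themselves come from the abstract.

```python
# Figures as reported in the abstract; all names here are hypothetical.
pathological_frames = {
    "fresh blood": 718,
    "vascular lesions": 3097,
    "inflammatory and ulcerative lesions": 1369,
}
normal_frames = 20_000
videos_included = 4174
videos_with_findings = 1480

# Sum of the three pathological categories.
total_pathological = sum(pathological_frames.values())

# Pathological plus normal frames gives the size of the whole database.
total_frames = total_pathological + normal_frames

# Proportion of included videos with at least one pathological finding.
share_with_findings = videos_with_findings / videos_included

print(total_pathological)            # 5184
print(total_frames)                  # 25184
print(f"{share_with_findings:.1%}")  # 35.5%
```

The three categories sum to the 5,184 pathological frames stated in the abstract, and adding the 20,000 normal frames gives 25,184, which matches the "25,000-image" figure in the title once rounded. The proportion of videos with findings is about 35.5%, reported as 35% in the abstract.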