IEEE Access (Jan 2024)

GIFCOS-DT: One Stage Detection of Gastrointestinal Tract Lesions From Endoscopic Images With Distance Transform

  • Thanh-Hai Tran,
  • Danh Huy Vu,
  • Minh Hanh Tran,
  • Viet Hang Dao,
  • Hai Vu,
  • Thi Thuy Nguyen

DOI
https://doi.org/10.1109/ACCESS.2024.3491833
Journal volume & issue
Vol. 12
pp. 163698 – 163714

Abstract


This study aims to develop a computer-aided diagnostic system based on deep learning for detecting typical lesions in the human gastrointestinal tract during endoscopic examinations. We propose a lesion detection model, called GIFCOS-DT, that is built upon a one-stage object detector (Fully Convolutional One-Stage Object Detection, FCOS). To deal with the diverse shapes and appearances of lesions, we introduce a new loss function based on the Distance Transform, which describes elongated or curved lesion shapes better than common loss functions such as Intersection over Union (IoU) or centroid loss. We then deploy the detection model on an embedded device that connects to the endoscopic machine to assist endoscopists during examinations. Multithreading is employed to accelerate the processing of all steps of the system. Extensive experiments have been conducted on two challenging datasets, the benchmark dataset (Kvasir-SEG) and our newly collected dataset (IGH_GIEndoLesion-SEG), which include various typical lesions of the gastrointestinal (GI) tract (reflux esophagitis, esophageal cancer, Helicobacter pylori negative gastritis, Helicobacter pylori positive gastritis, gastric cancer, duodenal ulcer, and colorectal polyps). Experimental results show that our proposed method outperforms the original FCOS by 4.2% and 7.2% on Kvasir-SEG and our collected dataset, respectively, in terms of the average $AP_{50}$ score. On the Kvasir-SEG dataset, GIFCOS-DT outperforms state-of-the-art detectors such as Faster R-CNN, DETR, YOLOv3, and YOLOv4. Our lesion detection support system runs at 14.85 FPS on an embedded Jetson AGX Xavier and at 31.92 FPS on an RTX 3090. The detection results for the various lesion types are promising, particularly for malignant lesions such as gastric cancers. The proposed system can be deployed as an assistant tool in endoscopy to reduce missed detection of lesions. Our code is available at https://github.com/hanhtran201/GIFCOS-DT.
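
To illustrate why a distance transform can describe elongated or curved shapes better than a centre-based measure, the following minimal Python sketch computes a city-block distance transform on a toy L-shaped mask. It is not the GIFCOS-DT loss: the mask, the function names (`distance_transform`, `bbox_centre`, `peak`), and the choice of the city-block metric are all assumptions made for illustration only. The point it shows is that the bounding-box centre of a curved region can fall on background, whereas the distance transform always peaks inside the region.

```python
# Toy illustration only: hypothetical names, city-block metric assumed.
# The mask must contain at least one background (0) pixel.

MASK = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]


def distance_transform(mask):
    """Two-pass city-block distance from each foreground pixel
    to the nearest background pixel (background pixels get 0)."""
    h, w = len(mask), len(mask[0])
    inf = h + w  # larger than any possible city-block distance
    dist = [[inf if mask[y][x] else 0 for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < w - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


def bbox_centre(mask):
    """Integer centre (x, y) of the tight bounding box around the foreground."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    return (min(xs) + max(xs)) // 2, (min(ys) + max(ys)) // 2


def peak(dist):
    """Location (x, y) of the first maximum of the distance map."""
    best = max(max(row) for row in dist)
    for y, row in enumerate(dist):
        for x, v in enumerate(row):
            if v == best:
                return x, y


DIST = distance_transform(MASK)
CX, CY = bbox_centre(MASK)
PX, PY = peak(DIST)
print("bounding-box centre:", (CX, CY), "on lesion:", bool(MASK[CY][CX]))
print("distance-transform peak:", (PX, PY), "on lesion:", bool(MASK[PY][PX]))
```

For this mask the bounding-box centre lands on a background pixel in the concave corner of the L, while the distance-transform peak sits at the elbow of the shape, inside the region. A production pipeline would more likely use a library routine (for example a Euclidean distance transform) than this hand-written two-pass version.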

Keywords