Utilizing adaptive deformable convolution and position embedding for colon polyp segmentation with a visual transformer

Mohamed Yacin Sikkandar; Sankar Ganesh Sundaram; Ahmad Alassaf; Ibrahim AlMohimeed; Khalid Alhussaini; Adham Aleid; Salem Ali Alolayan; P. Ramkumar; Meshal Khalaf Almutairi; S. Sabarunisha Begum

doi:10.1038/s41598-024-57993-0

Scientific Reports (Mar 2024)

Affiliations

Mohamed Yacin Sikkandar: Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University
Sankar Ganesh Sundaram: Department of Artificial Intelligence and Data Science, KPR Institute of Engineering and Technology
Ahmad Alassaf: Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University
Ibrahim AlMohimeed: Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University
Khalid Alhussaini: Department of Biomedical Technology, College of Applied Medical Sciences, King Saud University
Adham Aleid: Department of Biomedical Technology, College of Applied Medical Sciences, King Saud University
Salem Ali Alolayan: Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University
P. Ramkumar: Department of Computer Science and Engineering, Sri Sairam College of Engineering
Meshal Khalaf Almutairi: Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University
S. Sabarunisha Begum: Department of Biotechnology, P.S.R. Engineering College

Journal volume & issue: Vol. 14, no. 1
pp. 1 – 16

Abstract

Polyp detection is a challenging task in the diagnosis of colorectal cancer (CRC), and it demands clinical expertise because polyps vary widely in appearance. Recent years have witnessed the development of automated polyp detection systems that assist experts in early diagnosis, considerably reducing diagnosis time and diagnostic errors. In automated CRC diagnosis, polyp segmentation is an important step, typically carried out with deep learning segmentation models. Recently, Vision Transformers (ViTs) have been gradually replacing these models because of their ability to capture long-range dependencies among image patches. However, existing ViTs for polyp segmentation do not harness the inherent self-attention abilities of the Transformer and instead incorporate complex attention mechanisms. This paper presents Polyp-Vision Transformer (Polyp-ViT), a novel model based on the conventional Transformer architecture and enhanced with adaptive mechanisms for feature extraction and positional embedding. Polyp-ViT is tested on the Kvasir-SEG and CVC-ClinicDB datasets, achieving segmentation accuracies of 0.9891 ± 0.01 and 0.9875 ± 0.71, respectively, and outperforming state-of-the-art models. Polyp-ViT is a prospective tool for polyp segmentation that can also be adapted to other medical image segmentation tasks because of its ability to generalize well.
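
The abstract reports segmentation accuracy as a mean with a spread but does not define the metric. As a minimal sketch, assuming the figure is pixel-wise accuracy on binary masks averaged over test images, the computation might look like the following. The function names `pixel_accuracy` and `summarize`, the toy masks, and the use of the population standard deviation are illustrative assumptions, not the authors' evaluation code.

```python
from statistics import mean, pstdev


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the ground truth.

    Both masks are lists of rows holding 0 (background) or 1 (polyp).
    This definition is an assumption; the abstract does not specify it.
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")
    total = sum(len(row) for row in target)
    if total == 0:
        raise ValueError("masks must not be empty")
    correct = sum(
        p == t
        for p_row, t_row in zip(pred, target)
        for p, t in zip(p_row, t_row)
    )
    return correct / total


def summarize(scores):
    """Return (mean, population standard deviation) of per-image scores."""
    return mean(scores), pstdev(scores)


# Hypothetical 2x2 masks, used only to exercise the functions.
pred_a, true_a = [[0, 1], [1, 1]], [[0, 1], [0, 1]]  # 3 of 4 pixels agree
pred_b, true_b = [[0, 0], [1, 1]], [[0, 0], [1, 1]]  # all 4 pixels agree

scores = [pixel_accuracy(pred_a, true_a), pixel_accuracy(pred_b, true_b)]
avg, spread = summarize(scores)
print(f"accuracy = {avg:.4f} ± {spread:.4f}")
```

Pixel accuracy can look high when polyps occupy a small share of each frame, so overlap measures such as the Dice coefficient are often reported alongside it.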

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
