IEEE Access (Jan 2024)

Polyp Segmentation With the FCB-SwinV2 Transformer

  • Kerr Fitzgerald,
  • Jorge Bernal,
  • Aymeric Histace,
  • Bogdan J. Matuszewski

DOI
https://doi.org/10.1109/ACCESS.2024.3376228
Journal volume & issue
Vol. 12
pp. 38927 – 38943

Abstract

Polyp segmentation within colonoscopy video frames using deep learning models has the potential to automate colonoscopy screening procedures. This could help improve the early lesion detection rate and the in vivo characterization of polyps that could develop into colorectal cancer. Recent state-of-the-art deep learning polyp segmentation models have combined Convolutional Neural Network (CNN) architectures and Transformer Network (TN) architectures. Motivated by the aim of improving the performance of polyp segmentation models and their robustness to data variations beyond those covered during training, we propose a new CNN-TN hybrid model named the FCB-SwinV2 Transformer. This model was created by making extensive modifications to the recent state-of-the-art FCN-Transformer, including replacing the TN branch architecture with a SwinV2 U-Net. The performance of the FCB-SwinV2 Transformer is evaluated on the popular colonoscopy segmentation benchmarking datasets Kvasir-SEG, CVC-ClinicDB and ETIS-LaribPolypDB. Generalizability tests are also conducted to determine whether models can maintain accuracy when evaluated on data outside of the training distribution. The FCB-SwinV2 Transformer consistently achieves higher mean Dice and mean IoU scores than other models reported in the literature and therefore represents new state-of-the-art performance. The importance of understanding subtleties in evaluation metrics and dataset partitioning is also demonstrated and discussed. Code is available at: https://github.com/KerrFitzgerald/Polyp_FCB-SwinV2Transformer
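
As a rough illustration of the mean Dice and mean IoU scores mentioned in the abstract, the sketch below computes both for binary masks with NumPy. It is not taken from the authors' repository. The function names (`dice_and_iou`, `macro_mean`, `micro_mean`) and the `eps` smoothing term are hypothetical choices made for this example. The abstract does not say which metric subtleties the paper examines, so the contrast shown here between averaging per-image scores and pooling pixels across a dataset is only an assumed example of how a reported "mean" can differ.

```python
import numpy as np


def dice_and_iou(pred, target, eps=1e-8):
    """Return (Dice, IoU) for one pair of binary masks.

    `eps` is an assumed smoothing term that avoids division by zero
    when both masks are empty.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    pred_area = pred.sum()
    target_area = target.sum()
    union = pred_area + target_area - intersection
    dice = (2.0 * intersection + eps) / (pred_area + target_area + eps)
    iou = (intersection + eps) / (union + eps)
    return float(dice), float(iou)


def macro_mean(preds, targets):
    """Score each image separately, then average (images weighted equally)."""
    scores = [dice_and_iou(p, t) for p, t in zip(preds, targets)]
    mean_dice = sum(s[0] for s in scores) / len(scores)
    mean_iou = sum(s[1] for s in scores) / len(scores)
    return mean_dice, mean_iou


def micro_mean(preds, targets):
    """Pool the pixels of all images, then score once (pixels weighted equally)."""
    all_pred = np.concatenate([np.asarray(p, dtype=bool).ravel() for p in preds])
    all_target = np.concatenate([np.asarray(t, dtype=bool).ravel() for t in targets])
    return dice_and_iou(all_pred, all_target)


if __name__ == "__main__":
    # Two toy masks: a perfect prediction and an over-segmented one.
    preds = [np.array([[1, 1], [0, 0]]), np.array([[1, 1, 1, 1]])]
    targets = [np.array([[1, 1], [0, 0]]), np.array([[1, 1, 0, 0]])]
    print("macro (Dice, IoU):", macro_mean(preds, targets))
    print("micro (Dice, IoU):", micro_mean(preds, targets))
```

On this toy input the per-image average gives a Dice of about 0.833 and an IoU of 0.75, while pooling pixels gives a Dice of 0.8 and an IoU of about 0.667. The same predictions can therefore yield different headline numbers depending on the averaging convention.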

Keywords