IEEE Access (Jan 2025)

Heavy and Lightweight Deep Learning Models for Semantic Segmentation: A Survey

  • Cristina Carunta,
  • Alina Carunta,
  • Calin-Adrian Popa

DOI
https://doi.org/10.1109/ACCESS.2025.3529812
Journal volume & issue
Vol. 13
pp. 17745 – 17765

Abstract

Semantic segmentation is an important computer vision task with numerous real-world applications, including autonomous driving, video surveillance, medical image analysis, robotics, and augmented reality, and its popularity has increased with the development of deep learning approaches. We provide a detailed review of the most significant methods for both heavy and lightweight two-dimensional (2D) semantic segmentation, from the introduction of convolutional neural networks to the adoption of the Transformer architecture, a widely used model with state-of-the-art results in several artificial intelligence fields. The methods are described from an architectural design perspective, covering encoder-decoder architectures, multi-resolution branch approaches, two-pathway encoder architectures, attention-based models, and pyramid-based models. Additionally, some of the most popular datasets and performance metrics are presented. Finally, we investigate the limitations of these methods, compare their performance on the Pascal VOC 2012, Cityscapes, and ADE20K datasets, and indicate future research directions.

Keywords