IEEE Access (Jan 2024)

Enhancing Road Scene Segmentation With an Optimized DeepLabV3+

  • Zhe Ren,
  • Libao Wang,
  • Tianming Song,
  • Yihang Li,
  • Jian Zhang,
  • Fengfeng Zhao

DOI
https://doi.org/10.1109/ACCESS.2024.3521597
Journal volume & issue
Vol. 12
pp. 197748 – 197765

Abstract

Semantic segmentation, as a dense prediction task, is inevitably affected by various external factors, so common road image segmentation models struggle to meet the dual demands of high accuracy and real-time performance in unstructured road scenarios. To address this trade-off, this paper proposes an enhanced road scene segmentation method based on DeepLabV3+. First, the heavy Xception backbone is replaced with the lightweight MobileNetV2, significantly boosting real-time efficiency while maintaining competitive segmentation accuracy. Second, the Atrous Spatial Pyramid Pooling (ASPP) module is optimized by introducing depthwise separable convolutions and a hierarchical feature fusion strategy, which reduces computational complexity and mitigates the grid effect, a limitation of many current models. Finally, a Shuffle Attention mechanism is incorporated to improve the handling of small objects and fine details, such as distant pedestrians or the items they carry, enhancing segmentation precision without excessive computational overhead. The method was trained and evaluated on the Cityscapes and CamVid datasets, achieving 84.3% mPA and 41.8 FPS on Cityscapes, and 78.1% mPA and 30.2 FPS on CamVid. These results demonstrate a significantly improved balance between segmentation accuracy and real-time performance.
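
To illustrate why depthwise separable convolutions can reduce the cost of an ASPP branch, the following minimal sketch compares weight counts for a standard 3x3 convolution and its depthwise separable counterpart. The channel sizes (320 in, 256 out) are illustrative assumptions and are not taken from the paper; the function names are hypothetical, and biases and normalization layers are ignored. The atrous (dilation) rate does not appear because it changes the receptive field, not the number of weights.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Assumed, illustrative channel sizes; not values reported in the paper.
K, C_IN, C_OUT = 3, 320, 256

standard = conv_params(K, C_IN, C_OUT)
separable = separable_conv_params(K, C_IN, C_OUT)
ratio = separable / standard  # equals 1/c_out + 1/k**2

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {ratio:.4f}")
```

Under these assumed sizes, the separable form needs roughly 11.5% of the weights of the standard convolution. Actual savings in the proposed model depend on its real layer configuration.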

Keywords