Complex & Intelligent Systems (Apr 2023)

Mutually reinforcing non-local neural networks for semantic segmentation

  • Tianping Li,
  • Yanjun Wei,
  • Zhaotong Cui,
  • Guanxing Li,
  • Meng Li,
  • Hua Zhang

DOI
https://doi.org/10.1007/s40747-023-01056-w
Journal volume & issue
Vol. 9, no. 5
pp. 6037 – 6049

Abstract

The ability to capture long-distance interdependence between pixels benefits semantic segmentation. Semantic segmentation also requires effective use of pixel-to-pixel similarity in the channel direction to enhance pixel regions. Asymmetric Non-local Neural Networks (ANNet) combine multi-scale spatial pyramid pooling modules with non-local blocks to reduce model parameters without sacrificing performance. However, ANNet does not consider pixel similarity in the channel direction of the feature map, so its segmentation results are not ideal. This article proposes Mutually Reinforcing Non-local Neural Networks (MRNNet) to improve ANNet. MRNNet consists of a channel enhancement regions module (CERM) and a position-enhanced pixels module (PEPM). In contrast to the Asymmetric Fusion Non-local Block (AFNB) in ANNet, CERM does not fuse the feature maps of the high and low stages; instead, it utilizes ANNet's auxiliary loss function. By computing the similarity between feature maps in the channel direction, CERM improves the category representation of the feature maps along the channel dimension and reduces the cost of matrix multiplication. PEPM enhances pixels in the spatial direction of the feature map by computing the similarity between pixels in that direction. Experiments show that our segmentation accuracy on the Cityscapes test set reaches 81.9%. Compared with ANNet, the model's parameters are reduced by 11.35 M, and on ten images of size 2048 × 1024, the average inference time of MRNNet is 0.103 s faster than that of ANNet.
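The two similarity directions the abstract contrasts can be illustrated with a minimal NumPy sketch: a channel-direction affinity is a C × C matrix relating the C channel maps to each other, while a spatial-direction affinity is an N × N matrix relating the N = H × W pixel positions. This is only an illustration of the general idea, not the paper's exact CERM/PEPM modules; the function names and the plain softmax-affinity form are assumptions.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def channel_enhance(x):
    """Channel-direction similarity (idea behind CERM, simplified):
    x is a flattened feature map of shape (C, N) with N = H*W.
    The affinity is C x C, so its cost does not grow with image size."""
    affinity = softmax(x @ x.T)     # (C, C) channel-to-channel similarity
    return affinity @ x             # each channel re-weighted by similar channels

def position_enhance(x):
    """Spatial-direction similarity (idea behind PEPM, simplified):
    the affinity is N x N, relating every pixel position to every other."""
    affinity = softmax(x.T @ x)     # (N, N) pixel-to-pixel similarity
    return x @ affinity.T           # each pixel aggregated from similar pixels
```

Both functions preserve the (C, N) shape of the input, so their outputs can be added back to the original feature map as a residual, which is how non-local blocks are typically used.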

Keywords