IET Computer Vision (Feb 2024)

Lite‐weight semantic segmentation with AG self‐attention

  • Bing Liu,
  • Yansheng Gao,
  • Hai Li,
  • Zhaohao Zhong,
  • Hongwei Zhao

DOI
https://doi.org/10.1049/cvi2.12225
Journal volume & issue
Vol. 18, no. 1
pp. 72 – 83

Abstract

Because semantic segmentation incurs large computational and GPU memory costs, some works focus on designing lightweight models that achieve a good trade‐off between computational cost and accuracy. A common approach is to combine a CNN with a vision transformer. However, these methods ignore the contextual information of multiple receptive fields, and they often fail to reinject the detailed information lost when multi‐scale features are downsampled. To address these issues, we propose AG Self‐Attention, which consists of Enhanced Atrous Self‐Attention (EASA) and Gate Attention. AG Self‐Attention adds the contextual information of multiple receptive fields to the global semantic feature. Specifically, EASA uses weight‐shared atrous convolutions with different atrous rates to capture contextual information under different receptive fields. Gate Attention introduces a gating mechanism that injects detailed information into the global semantic feature and filters that information by producing a “fusion” gate and an “update” gate. To validate this design, we conduct extensive experiments on common semantic segmentation datasets, namely ADE20K, COCO‐Stuff, PASCAL Context, and Cityscapes, which show that our method achieves state‐of‐the‐art performance and a good trade‐off between computational cost and accuracy.
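
As a rough illustration of the weight‐shared atrous convolution idea named in the abstract, the sketch below applies one kernel to a 1‐D signal at several atrous rates, so the receptive field widens without adding parameters. It is not the authors' implementation: the function names, the kernel values, and the rates (1, 2, 3) are hypothetical, and the actual EASA module presumably operates on 2‐D feature maps inside a self‐attention block.

```python
def atrous_conv1d(signal, kernel, rate):
    """Zero-padded, same-length 1-D atrous (dilated) convolution.

    Uses the cross-correlation convention common in deep learning and
    assumes an odd kernel length (an assumption of this sketch).
    """
    k = len(kernel)
    center = k // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Kernel taps are spaced `rate` samples apart.
            idx = i + (j - center) * rate
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def multi_rate_context(signal, kernel, rates=(1, 2, 3)):
    """Apply the SAME kernel at every rate (weight sharing across rates)."""
    return {rate: atrous_conv1d(signal, kernel, rate) for rate in rates}


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    shared_kernel = [1, 2, 3]  # hypothetical weights
    for rate, response in multi_rate_context(impulse, shared_kernel).items():
        print(rate, response)
```

Feeding in a unit impulse makes the effect visible: the three kernel weights appear in every output, but they are spread further apart as the rate grows, which is how a single set of weights can cover several receptive fields.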

Keywords