IET Signal Processing (Jan 2024)

Att-U2Net: Using Attention to Enhance Semantic Representation for Salient Object Detection

  • Chenzhe Jiang,
  • Banglian Xu,
  • Qinghe Zheng,
  • Zhengtao Li,
  • Leihong Zhang,
  • Zimin Shen,
  • Quan Sun,
  • Dawei Zhang

DOI
https://doi.org/10.1049/sil2/6606572
Journal volume & issue
Vol. 2024

Abstract

Salient object detection (SOD) mimics the human visual perceptual system to find the most visually appealing object in an image, and it has been widely used in computer vision tasks such as image understanding, semantic segmentation, and target tracking. The U2Net model performs well in SOD because of its unique U-shaped residual structure and its U-shaped backbone, which incorporates feature information at different scales. However, in the U-shaped structure, the global semantic information computed at the topmost layer may be gradually diluted by the large amount of local information in the top-down path. In addition, the U-shaped residual structure pays insufficient attention to the features in the salient target region of the image and passes redundant features to the next stage. To address these two shortcomings, this paper proposes two improvements to the U2Net model. First, an attention gating mechanism is added to the U-structure backbone to filter redundant features. Second, a channel attention (CA) mechanism is introduced in the residual U-block (RSU) module to capture important features. Experimental results show that the proposed method achieves higher accuracy than the U2Net model.
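The abstract does not specify how the two attention mechanisms are built, so the following is only a minimal sketch of the general ideas, not the authors' implementation. It assumes a squeeze-and-excitation style channel attention and an additive attention gate of the kind used in attention-gated U-shaped networks. All function names, weight shapes, and parameters (`channel_attention`, `attention_gate`, `w_reduce`, `w_expand`, `w_skip`, `w_gate`, `psi`) are hypothetical, and feature maps are plain nested lists indexed as `x[channel][row][col]` to keep the example dependency-free.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def channel_attention(x, w_reduce, w_expand):
    """SE-style channel attention on x[c][h][w] (assumed form, not the paper's)."""
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    # Excite: bottleneck MLP (ReLU, then sigmoid) gives a weight in (0, 1) per channel.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w_reduce]
    weights = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_expand]
    # Rescale every channel by its weight.
    scaled = [[[v * a for v in row] for row in ch] for ch, a in zip(x, weights)]
    return scaled, weights


def attention_gate(skip, gate, w_skip, w_gate, psi):
    """Additive attention gate (assumed form): one coefficient in (0, 1) per pixel.

    `gate` is assumed to be already resampled to the spatial size of `skip`.
    """
    height, width = len(skip[0]), len(skip[0][0])
    alpha = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            s = [ch[i][j] for ch in skip]  # skip-connection features at this pixel
            g = [ch[i][j] for ch in gate]  # gating features at this pixel
            # 1x1 projections of both inputs, summed and passed through ReLU.
            q = [
                max(0.0, sum(a * v for a, v in zip(rs, s)) + sum(b * v for b, v in zip(rg, g)))
                for rs, rg in zip(w_skip, w_gate)
            ]
            alpha[i][j] = sigmoid(sum(p * v for p, v in zip(psi, q)))
    # Suppress skip features where the gate assigns a low coefficient.
    gated = [[[ch[i][j] * alpha[i][j] for j in range(width)] for i in range(height)] for ch in skip]
    return gated, alpha
```

In this sketch the channel weights rescale whole feature maps, which is one way an RSU block could emphasize informative channels, while the gate produces a spatial map that damps skip-connection features before they are merged in the decoder path. A real model would learn the weights and operate on tensors in a deep learning framework.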