Symmetry (Jul 2024)

A Symmetric Efficient Spatial and Channel Attention (ESCA) Module Based on Convolutional Neural Networks

  • Huaiyu Liu,
  • Yueyuan Zhang,
  • Yiyang Chen

DOI
https://doi.org/10.3390/sym16080952
Journal volume & issue
Vol. 16, no. 8
p. 952

Abstract

Read online

In recent years, attention mechanisms have shown great potential in various computer vision tasks. However, most existing methods focus on developing more complex attention modules for better performance, which inevitably increases the complexity of the model. To overcome performance and complexity tradeoffs, this paper proposes efficient spatial and channel attention (ESCA), a symmetric, comprehensive, and efficient attention module. By analyzing squeeze-and-excitation (SE), convolutional block attention module (CBAM), coordinate attention (CA), and efficient channel attention (ECA) modules, we abandon the dimension-reduction operation of SE module, verify the negative impact of global max pooling (GMP) on the model, and apply a local cross-channel interaction strategy without dimension reduction to learn attention. We not only care about the channel features of the image, we also care about the spatial location of the target on the image, and we take into account the effectiveness of channel attention, so we designed the symmetric ESCA module. The ESCA module is effective, as demonstrated by its application in the ResNet-50 classification benchmark. With 26.26 M parameters and 8.545 G FLOPs, it introduces a mere 0.14% increment in FLOPs while achieving over 6.33% improvement in Top-1 accuracy and exceeding 3.25% gain in Top-5 accuracy. We perform image classification and object detection tasks on ResNet, MobileNet, YOLO, and other architectures on popular datasets such as Mini ImageNet, CIFAR-10, and VOC 2007. Experiments show that ESCA can achieve great improvement in model accuracy at a very small cost, and it performs well among similar models.

Keywords