IEEE Access (Jan 2020)
Multi-Level Context Aggregation Network With Channel-Wise Attention for Salient Object Detection
Abstract
Fully convolutional neural networks (FCNs) have shown clear advantages in salient object detection. However, most existing FCN-based methods still produce unsatisfactory predictions, such as coarse object boundaries or even incorrect detections, because they ignore the differences between multi-level features during aggregation or underuse the spatial details needed to locate boundaries. In this paper, we propose a novel end-to-end multi-level context aggregation network (MLCANet) to address these problems, in which bottom-up and top-down message passing cooperate in a joint manner. The bottom-up process, which aggregates low-level features rich in fine details into semantically richer high-level features, enhances the high-level features; in turn, the top-down process, which passes refined features from deeper layers to shallower ones, benefits from these enhanced high-level features. Furthermore, since features from different layers may not be equally important, we propose a multi-level feature aggregation mechanism with channel-wise attention that combines multi-level features by flexibly adjusting their contributions and absorbing useful information from other levels. The features obtained after message passing, which simultaneously encode semantic information and spatial details, are used to predict saliency maps. Extensive experiments demonstrate that our method produces high-quality saliency maps with clear boundaries and performs favorably against state-of-the-art methods without any pre-processing or post-processing.
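A minimal sketch of the channel-wise attentive aggregation idea described above is given below. The module names, channel widths, and the use of a squeeze-and-excitation-style gate are illustrative assumptions for exposition, not the paper's exact MLCANet formulation.

```python
# Sketch: multi-level feature aggregation with channel-wise attention (PyTorch).
# All names, channel counts, and the SE-style gating are assumptions for
# illustration; the paper's actual design may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    """Per-channel gate: global average pooling followed by a small MLP."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # x: (B, C, H, W) -> channel-wise weights of shape (B, C, 1, 1)
        w = self.fc(x.mean(dim=(2, 3)))
        return x * w.unsqueeze(-1).unsqueeze(-1)


class AttentiveAggregation(nn.Module):
    """Project multi-level features to a shared width, resize them to a common
    resolution, re-weight each level channel-wise, and sum them."""

    def __init__(self, in_channels_list, out_channels=64):
        super().__init__()
        self.projs = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels_list]
        )
        self.atts = nn.ModuleList(
            [ChannelAttention(out_channels) for _ in in_channels_list]
        )

    def forward(self, feats, target_size):
        fused = 0
        for f, proj, att in zip(feats, self.projs, self.atts):
            f = proj(f)
            f = F.interpolate(f, size=target_size, mode="bilinear",
                              align_corners=False)
            fused = fused + att(f)  # each level contributes a gated amount
        return fused


if __name__ == "__main__":
    # Fake backbone features from three levels (shallow to deep).
    feats = [torch.randn(1, 64, 56, 56),
             torch.randn(1, 128, 28, 28),
             torch.randn(1, 256, 14, 14)]
    agg = AttentiveAggregation([64, 128, 256], out_channels=64)
    out = agg(feats, target_size=(56, 56))
    print(out.shape)  # torch.Size([1, 64, 56, 56])
```

The gating step lets the network adjust how much each level contributes per channel, which is the role the abstract attributes to the channel-wise attention mechanism; a saliency prediction head would then operate on the fused feature map.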
Keywords