Detecting Camouflaged Objects via Multi-Stage Coarse-to-Fine Refinement

Yuye Wang; Tianyou Chen; Xiaoguang Hu; Jiaqi Shi; Zichong Jia

doi:10.1109/ACCESS.2024.3380893

IEEE Access (Jan 2024)

Affiliations

Yuye Wang: College of Physics and Information Engineering, Minnan Normal University, Zhangzhou, China
Tianyou Chen: Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan, China
Xiaoguang Hu: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
Jiaqi Shi: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
Zichong Jia: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China

DOI: https://doi.org/10.1109/ACCESS.2024.3380893
Journal volume & issue: Vol. 12, pp. 44055–44068

Abstract

Camouflaged objects blend into their surroundings. Compared with generic object detection and segmentation, camouflaged object detection is therefore considerably harder, because target boundaries are indistinct and foreground targets closely resemble the background. Although many algorithms have been proposed and perform well across a range of scenarios, they can still struggle with blurred boundaries and miss camouflaged targets in challenging scenes. In this paper, we introduce a multi-stage framework that segments camouflaged objects through coarse-to-fine refinement. Our network comprises three decoders, each with a distinct role. The first decoder uses the Bi-directional Locating Module to mine foreground and background cues and improve target localization. The second decoder exploits boundary information and incorporates the Multi-level Feature Fusion Module to generate prediction maps with finer boundaries. The third decoder introduces the Mask-guided Fusion Module, which processes high-resolution features under the guidance of the second decoder’s results, preserving structural details and producing fine-grained prediction maps. By integrating the three decoders, our model effectively identifies and segments camouflaged targets. Extensive experiments on three commonly used benchmark datasets show that, even without pre-processing or post-processing, our model outperforms 14 state-of-the-art algorithms.
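
The coarse-to-fine idea in the abstract can be illustrated with a small sketch. This is not the authors' network: the abstract does not give the internals of the Bi-directional Locating, Multi-level Feature Fusion, or Mask-guided Fusion modules, so the function names, the nearest-neighbour upsampling, and the sigmoid-gated residual update below are all assumptions chosen only to show how a coarse prediction might guide a higher-resolution stage.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling of a 2D list by an integer factor."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def mask_guided_refine(coarse_logits, fine_logits):
    """Hypothetical stand-in for a mask-guided fusion step.

    The coarse prediction is upsampled to the fine resolution, then the
    fine logits are gated by the coarse foreground probability and added
    back as a residual. Both choices are illustrative assumptions.
    """
    factor = len(fine_logits) // len(coarse_logits)
    guide = upsample_nearest(coarse_logits, factor)
    refined = []
    for g_row, f_row in zip(guide, fine_logits):
        refined.append([g + sigmoid(g) * f for g, f in zip(g_row, f_row)])
    return refined


def coarse_to_fine(stage_logits):
    """Refine predictions stage by stage, from coarsest to finest.

    stage_logits is a list of square 2D logit maps whose side lengths
    grow by an integer factor from one stage to the next.
    """
    pred = stage_logits[0]
    for fine in stage_logits[1:]:
        pred = mask_guided_refine(pred, fine)
    return pred
```

With three stages, for example a 1x1, a 2x2, and a 4x4 logit map, `coarse_to_fine` returns a 4x4 map in which each finer stage only adjusts the prediction inherited from the stage before it.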

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639