International Journal of Applied Earth Observations and Geoinformation (May 2023)
Semantic segmentation guided pseudo label mining and instance re-detection for weakly supervised object detection in remote sensing images
Abstract
Weakly supervised object detection (WSOD) in remote sensing images (RSIs) has considerable practical value because it requires only image-level annotations. Existing methods usually suffer from two problems. First, many methods mine pseudo ground truth (PGT) instances solely on the basis of the class confidence score (CCS). However, the CCS is not sufficiently reliable because of the inter-class similarity and intra-class diversity in RSIs, so the reliability of the corresponding PGT instances is limited. In addition, the most discriminative part of an object, which has a high CCS, is easily selected as the PGT instance instead of the whole object. Second, object localization relies solely on the candidate proposals generated by the selective search or edge boxes algorithm. However, the localization accuracy of these proposals is insufficient because of the cluttered backgrounds in RSIs. To address the first problem, a semantic segmentation guided pseudo label mining (SGPLM) module is proposed, which uses a novel metric named the class-specific object confidence score (COCS) to mine high-quality PGT instances. The COCS combines the CCS with a class-specific object overlap score (COOS), which is calculated through weakly supervised semantic segmentation. By incorporating the COOS, the mined PGT instances are more robust and tend to cover the whole object. To handle the second problem, an instance re-detection (IR) module is proposed to improve the localization accuracy of the WSOD model. In this module, an enhanced PGT instance generation strategy is designed to obtain enhanced PGT instances on the basis of the candidate proposals, and these enhanced PGT instances are used to train instance re-classification and re-localization branches, which are jointly used to infer the final results. Ablation studies validate the effectiveness of the SGPLM and IR modules. Comprehensive comparisons with other advanced methods show that the proposed method achieves state-of-the-art performance on two RSI datasets.
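The idea behind combining the CCS with an overlap score can be illustrated with a small sketch. The abstract does not specify how the COOS is computed or how it is combined with the CCS, so everything below is an assumption made for illustration: the overlap is taken as an IoU-style ratio between a proposal box and a binary class mask, the combination is a simple product, and the function names `overlap_score`, `combined_score`, and `mine_pseudo_instance` are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def overlap_score(box, mask):
    """Hypothetical COOS: IoU-style overlap between a box and a binary class mask.

    box is (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = box
    box_area = (x1 - x0) * (y1 - y0)
    inter = int(mask[y0:y1, x0:x1].sum())      # mask pixels inside the box
    union = box_area + int(mask.sum()) - inter
    return inter / union if union > 0 else 0.0


def combined_score(ccs, coos):
    """Hypothetical COCS: product of the two scores (an assumed combination rule)."""
    return ccs * coos


def mine_pseudo_instance(boxes, ccs_scores, mask):
    """Return the index of the proposal with the highest combined score, plus all scores."""
    scores = [combined_score(c, overlap_score(b, mask))
              for b, c in zip(boxes, ccs_scores)]
    return int(np.argmax(scores)), scores


# Toy class mask standing in for a weakly supervised segmentation result:
# the object fills rows 20..79 and columns 30..69.
mask = np.zeros((100, 100), dtype=bool)
mask[20:80, 30:70] = True

boxes = [
    (40, 40, 60, 60),  # small box on a discriminative part of the object
    (30, 20, 70, 80),  # box covering the whole object
]
ccs = [0.9, 0.7]       # the part box has the higher class confidence

best, scores = mine_pseudo_instance(boxes, ccs, mask)
print("selected by CCS alone:", int(np.argmax(ccs)))
print("selected by combined score:", best)
```

In this toy case the CCS alone prefers the small part box, whereas the combined score prefers the box that covers the whole mask, which is the behavior the SGPLM module is described as encouraging.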