Gong-kuang zidonghua (Jan 2024)

A coal foreign object detection method based on cross modal attention fusion

  • CAO Xiangang,
  • LI Hu,
  • WANG Peng,
  • WU Xudong,
  • XIANG Jingfang,
  • DING Wentao

DOI
https://doi.org/10.13272/j.issn.1671-251x.2023110035
Journal volume & issue
Vol. 50, no. 1
pp. 57 – 65

Abstract

RGB images of coal foreign objects lack spatial and edge information about the targets. The objects to be detected resemble the background in color and texture, contrast is low, and the objects overlap and occlude one another. As a result, the features of coal foreign objects are extracted insufficiently, and existing foreign object detection methods struggle to achieve good results. To address these problems, a coal foreign object detection method based on cross modal attention fusion is proposed. Depth images are introduced to construct a dual feature pyramid network (DFPN) for RGB images and Depth images, and a shallow feature extraction strategy is adopted to extract low-level features of the Depth images. Basic features such as depth edges and depth textures then assist the deep features of the RGB images, so that complementary information between the two kinds of features is captured effectively. This enriches the spatial and edge information of foreign object features and improves detection precision. A cross modal attention fusion module (CAFM), based on coordinate attention and improved spatial attention, is constructed to jointly optimize and fuse the RGB features and Depth features. It strengthens the network's attention to the visible parts of occluded foreign objects in the feature map and improves the detection precision for occluded foreign objects. Finally, a region-based convolutional neural network (R-CNN) outputs the classification, regression, and segmentation results for coal foreign objects. The experimental results show that, in terms of detection precision, the average segmentation precision (AP) of the proposed method is 3.9% higher than that of Mask transformer, the better-performing of the two-stage models compared. In terms of detection efficiency, the proposed method has a single-frame detection time of 110.5 ms (about 9 frames per second), which can meet the real-time requirements of foreign object detection.
The coal foreign object detection method based on cross modal attention fusion uses spatial features to complement color, shape, and texture features. It accurately recognizes the differences among coal foreign objects and between coal foreign objects and the conveyor belt, which effectively improves the detection precision for foreign objects with complex features. It reduces false alarms and missed detections, and achieves precise detection and pixel-level segmentation of coal foreign objects with complex features.
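
The fusion idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' CAFM: the abstract does not give the module's layers, so the function names (`coordinate_attention`, `spatial_attention`, `fuse`), the additive fusion, and the parameter-free gating below are all assumptions made for illustration. Published coordinate attention and spatial attention blocks use learned convolutions, which are omitted here; the sketch only shows how direction-aware weights and a per-pixel weight map might gate RGB and Depth feature maps of the same shape before they are combined.

```python
import numpy as np


def sigmoid(x):
    """Logistic function mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def coordinate_attention(feat):
    """Simplified, parameter-free coordinate-style attention (assumption).

    feat has shape (C, H, W). Pooling along each spatial axis separately
    keeps positional information along the other axis.
    """
    h_desc = feat.mean(axis=2, keepdims=True)  # (C, H, 1): pooled over width
    w_desc = feat.mean(axis=1, keepdims=True)  # (C, 1, W): pooled over height
    # Broadcasting the two gates yields one weight per (channel, row, column).
    return sigmoid(h_desc) * sigmoid(w_desc)


def spatial_attention(feat):
    """Simplified spatial attention: one weight per pixel, shared by channels."""
    avg_map = feat.mean(axis=0, keepdims=True)  # (1, H, W)
    max_map = feat.max(axis=0, keepdims=True)   # (1, H, W)
    # A learned convolution over these maps is replaced by a plain sum here.
    return sigmoid(avg_map + max_map)


def fuse(rgb_feat, depth_feat):
    """Gate each modality, add them, then re-weight the sum per pixel."""
    if rgb_feat.shape != depth_feat.shape:
        raise ValueError("RGB and Depth feature maps must have the same shape")
    rgb_weighted = rgb_feat * coordinate_attention(rgb_feat)
    depth_weighted = depth_feat * coordinate_attention(depth_feat)
    fused = rgb_weighted + depth_weighted
    return fused * spatial_attention(fused)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb = rng.normal(size=(4, 8, 6))    # hypothetical RGB feature map
    depth = rng.normal(size=(4, 8, 6))  # hypothetical Depth feature map
    print(fuse(rgb, depth).shape)
```

In a trained network the gates would be produced by learned layers, and the fused map would feed the detection and segmentation heads; the sketch stops at the fused feature map.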

Keywords