IEEE Access (Jan 2024)
A Fusion of RGB Features and Local Descriptors for Object Detection in Road Scene
Abstract
Many texture descriptors have been introduced in recent years to improve texture analysis and classification, which are important in many computer vision tasks, including object recognition and detection, human detection, and especially face recognition. Local patterns are texture descriptors that extract distinctive texture features and are robust to noise and illumination variation. This paper uses local pattern features to boost object detection models within a multi-modal fusion paradigm, so that the network obtains reliable feature maps during forward propagation regardless of variations in image capture conditions. We propose an adaptive fusion architecture for RGB and Local Ternary Pattern (LTP) information. This architecture leverages local patterns to enrich the information in the original feature maps and can be adapted to many object detection models. Our local pattern fusion network targets the backbone and neck modules with a simple and efficient operation. The most notable accuracy improvement, 8.03%, is observed for Cascade R-CNN on the KITTI dataset. In difficult conditions, our fusion models significantly lift the original performance from 4.7% to 66.3% mAP.
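For readers unfamiliar with the descriptor, the following is a minimal sketch of a standard 3x3 Local Ternary Pattern (LTP) computation on a grayscale image, splitting the ternary code into the usual "upper" and "lower" binary maps. It is an illustration only, not the fusion network proposed here: the function name `local_ternary_pattern`, the threshold parameter `t`, its default value, the neighbour ordering, and the edge padding are all assumptions made for this example.

```python
import numpy as np

# Offsets of the 8 neighbours in a 3x3 window, clockwise from top-left.
# The ordering is an arbitrary choice for this sketch.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def local_ternary_pattern(gray, t=5):
    """Return (upper, lower) LTP code maps for a 2-D grayscale image.

    A neighbour p of a centre pixel c is coded +1 if p >= c + t,
    -1 if p <= c - t, and 0 otherwise. The +1 codes are packed into
    `upper` and the -1 codes into `lower`, one bit per neighbour.
    `t` is a hypothetical default, not a value taken from the paper.
    """
    gray = np.asarray(gray, dtype=np.int32)  # signed type avoids uint8 wrap-around
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")    # replicate borders (assumption)
    upper = np.zeros((h, w), dtype=np.uint8)
    lower = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(OFFSETS):
        # Shifted view of the image aligned with each centre pixel.
        neighbour = padded[1 + dy : 1 + dy + h, 1 + dx : 1 + dx + w]
        diff = neighbour - gray
        weight = np.uint8(1 << bit)
        upper |= (diff >= t).astype(np.uint8) * weight
        lower |= (diff <= -t).astype(np.uint8) * weight
    return upper, lower
```

Because each code depends only on intensity differences within a small tolerance `t`, the two maps change little under uniform brightness shifts and small noise, which is the robustness property the abstract refers to. How these maps are combined with RGB features inside the backbone and neck is not specified at this point in the text and is not shown here.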
Keywords