IEEE Access (Jan 2024)
ReDeformTR: Wildlife Re-Identification Based on Light-Weight Deformable Transformer With Multi-Image Feature Fusion
Abstract
Wildlife re-identification (Re-ID) techniques are key to animal tracking and preservation. However, current deep learning methods perform unsatisfactorily in cross-camera scenarios, especially in terms of mean average precision (mAP). This work introduces ReDeformTR, a novel model for wildlife Re-ID that focuses on identifying individual animals in images captured by different cameras. ReDeformTR integrates a lightweight deformable transformer architecture that extracts and fuses features from multiple images and scales, yielding an efficient representation of each individual animal and improving query performance. A convolutional neural network (CNN) backbone performs feature extraction, while the deformable transformer refines and fuses the extracted features. The deformable attention mechanism reduces computational overhead by selectively sampling features. Experiments show that ReDeformTR achieves superior mAP on a cross-camera wildlife dataset in ATRW: 84.98%, an improvement of 12.29% over the state-of-the-art model (PPGNet). Furthermore, our model significantly reduces model parameter size, positioning it as a promising solution for wildlife Re-ID tasks.
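For readers unfamiliar with the mAP metric cited above, the following is a minimal sketch of how mean average precision is commonly computed for a retrieval-style Re-ID query. It is not the authors' evaluation code: the function names, the identity labels, and the toy rankings are assumptions made for illustration, and benchmark protocols such as ATRW may apply additional filtering rules that are not modeled here.

```python
from typing import List, Sequence


def average_precision(ranked_ids: Sequence[str], query_id: str) -> float:
    """AP for one query: mean of precision@k over the ranks k holding a correct match."""
    hits = 0
    precision_sum = 0.0
    for rank, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: List[Sequence[str]], query_ids: Sequence[str]) -> float:
    """mAP: average of the per-query AP values."""
    aps = [average_precision(r, q) for r, q in zip(rankings, query_ids)]
    return sum(aps) / len(aps) if aps else 0.0


# Hypothetical toy data: each ranking lists gallery identities, best match first.
rankings = [
    ["tiger_a", "tiger_b", "tiger_a"],  # query identity: tiger_a
    ["tiger_a", "tiger_b"],             # query identity: tiger_b
]
query_ids = ["tiger_a", "tiger_b"]
score = mean_average_precision(rankings, query_ids)
```

In this toy case the first query has correct matches at ranks 1 and 3, giving AP = (1/1 + 2/3) / 2, and the second has one correct match at rank 2, giving AP = 1/2. The mAP is the mean of the two values.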
Keywords