ReDeformTR: Wildlife Re-Identification Based on Light-Weight Deformable Transformer With Multi-Image Feature Fusion

Zitong Li; Zhengmao Yan; Weihong Tian; Detian Zeng; Yi Liu; Weimin Li

doi:10.1109/access.2024.3436813

IEEE Access (Jan 2024)

Affiliations

Zitong Li: School of Information, Hunan University of Humanities, Science and Technology, Loudi, China
Zhengmao Yan: School of Information, Hunan University of Humanities, Science and Technology, Loudi, China
Weihong Tian: School of Information, Hunan University of Humanities, Science and Technology, Loudi, China
Detian Zeng: School of Information, Hunan University of Humanities, Science and Technology, Loudi, China
Yi Liu: School of Information, Hunan University of Humanities, Science and Technology, Loudi, China
Weimin Li: School of Information, Hunan University of Humanities, Science and Technology, Loudi, China

DOI: https://doi.org/10.1109/access.2024.3436813
Journal volume & issue: Vol. 12
pp. 106321 – 106332

Abstract

Wildlife re-identification (Re-ID) techniques are key to animal tracking and preservation. However, the performance of current deep learning methods is unsatisfactory in cross-camera scenarios, especially in terms of mean average precision (mAP). This work introduces ReDeformTR, a novel model for wildlife Re-ID that focuses on identifying individual animals in images captured by different cameras. ReDeformTR integrates a lightweight deformable transformer architecture with multi-image feature fusion: it extracts and fuses features from multiple images and scales, which gives an efficient representation of each individual animal and improves query performance. A convolutional neural network (CNN) backbone performs feature extraction, while a deformable transformer refines and fuses the features. The deformable attention mechanism reduces computational overhead by selectively sampling features. Experiments show that ReDeformTR achieves superior mAP on a cross-camera wildlife dataset in ATRW. The mAP is 84.98%, a significant improvement of 12.29% over the state-of-the-art model (PPGNet). The model also has a significantly smaller parameter size, positioning it as a promising solution for wildlife Re-ID tasks.
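
Since the headline result is reported as mAP, the following minimal sketch shows how mean average precision is commonly computed for a retrieval-style Re-ID evaluation. It is not the authors' evaluation code: the function names, the cosine-similarity ranking, and the toy features are assumptions made for illustration, and any camera-based filtering of the gallery is omitted.

```python
import numpy as np


def average_precision(ranked_matches):
    """AP for one query.

    ranked_matches[i] is True when the i-th ranked gallery image
    shares the query's identity.
    """
    ranked_matches = np.asarray(ranked_matches, dtype=bool)
    n_relevant = int(ranked_matches.sum())
    if n_relevant == 0:
        return 0.0  # convention assumed here: no relevant item gives AP 0
    hits = np.cumsum(ranked_matches)                 # relevant items seen so far
    ranks = np.arange(1, len(ranked_matches) + 1)    # 1-based rank positions
    precision_at_k = hits / ranks
    # Average the precision values at the ranks where a match occurs.
    return float(precision_at_k[ranked_matches].sum() / n_relevant)


def mean_average_precision(query_feats, query_ids, gallery_feats, gallery_ids):
    """mAP over all queries, ranking the gallery by cosine similarity (assumed)."""
    query_feats = np.asarray(query_feats, dtype=float)
    gallery_feats = np.asarray(gallery_feats, dtype=float)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)

    # L2-normalise so the dot product equals cosine similarity.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = q @ g.T

    aps = []
    for i in range(len(q)):
        order = np.argsort(-sims[i])                 # most similar first
        matches = gallery_ids[order] == query_ids[i]
        aps.append(average_precision(matches))
    return float(np.mean(aps))


# Toy example with made-up 2-D features (not data from the paper).
toy_map = mean_average_precision(
    query_feats=[[1.0, 0.0], [0.0, 1.0]],
    query_ids=[0, 1],
    gallery_feats=[[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]],
    gallery_ids=[1, 1, 0],
)
print(round(toy_map, 4))
```

Each query's AP rewards placing same-identity gallery images near the top of the ranking, and mAP averages that score over all queries.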

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639