Animals (Jun 2024)

Utilizing Geographical Distribution Statistical Data to Improve Zero-Shot Species Recognition

  • Lei Liu,
  • Boxun Han,
  • Feixiang Chen,
  • Chao Mou,
  • Fu Xu

DOI
https://doi.org/10.3390/ani14121716
Journal volume & issue
Vol. 14, no. 12
p. 1716

Abstract

Species recognition is crucial for understanding the abundance and distribution of organisms and is important for biodiversity conservation and management. Traditional vision-based, deep learning-driven species recognition requires large amounts of well-labeled, high-quality image data, which are difficult to collect for rare and endangered species. In addition, recognition methods designed for specific species generalize poorly and are difficult to adapt to new species recognition scenarios. To address these issues, zero-shot species recognition based on Contrastive Language–Image Pre-training (CLIP) has become a research hotspot. However, previous studies have primarily used visual descriptions and taxonomic information of species to improve zero-shot recognition performance; the geographic distribution characteristics of species have not been explored for this purpose. To fill this gap, we propose a CLIP-driven zero-shot species recognition method that incorporates knowledge of the geographic distribution of species. First, we designed three prompts based on statistical data on the geographic distribution of species. Then, the latitude and longitude coordinates attached to each image in the species dataset were converted into addresses, which were aggregated to form the geographic distribution knowledge of each species. Finally, species recognition results were derived by computing the similarity between the features produced by the trained CLIP image encoder and text encoder. We conducted extensive experiments on multiple species datasets drawn from the iNaturalist 2021 dataset, where the zero-shot recognition accuracies for mammals, mollusks, reptiles, amphibians, birds, and insects were 44.96%, 15.27%, 17.51%, 9.47%, 28.35%, and 7.03%, respectively, improvements of 2.07%, 0.48%, 0.35%, 1.12%, 1.64%, and 0.61% over CLIP with the default prompt. The experimental results show that incorporating geographic distribution statistical data can effectively improve the performance of zero-shot species recognition, which provides a new way to use species domain knowledge.
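
The pipeline summarized in the abstract (aggregate per-species location statistics, fold them into a text prompt, then pick the class whose text feature is most similar to the image feature) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the prompt wording, the function names `build_geo_prompt` and `zero_shot_predict`, the `top_k` parameter, and the toy feature vectors are hypothetical and are not the three prompts or the code used in the paper. Real use would replace the toy vectors with features from CLIP's image and text encoders.

```python
import math
from collections import Counter


def build_geo_prompt(species, regions, top_k=3):
    """Build a text prompt naming the regions where a species is most often observed.

    `regions` is a list of address strings (e.g. country names), one per
    observation, assumed to come from reverse-geocoded coordinates.
    The prompt template is hypothetical, not the one from the paper.
    """
    counts = Counter(regions)
    # Sort by descending count, then by name, so the output is deterministic.
    top = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    if not top:
        # Fall back to a plain prompt when no location data is available.
        return f"a photo of a {species}."
    places = ", ".join(name for name, _ in top)
    return f"a photo of a {species}, a species mostly observed in {places}."


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_predict(image_feat, text_feats):
    """Return the label whose text feature is most similar to the image feature.

    `text_feats` maps each label to the feature vector of its prompt.
    """
    return max(text_feats, key=lambda label: cosine(image_feat, text_feats[label]))


if __name__ == "__main__":
    prompt = build_geo_prompt(
        "red fox", ["United Kingdom", "United States", "United Kingdom"], top_k=2
    )
    print(prompt)

    # Toy 2-D vectors standing in for encoder outputs (illustrative only).
    toy_text_feats = {"red fox": [0.9, 0.1], "gray wolf": [0.1, 0.9]}
    print(zero_shot_predict([1.0, 0.0], toy_text_feats))
```

Building the prompt from frequency counts keeps the geographic knowledge a per-species statistic rather than a per-image hint, which matches the abstract's description of integrating addresses across all images of a species.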

Keywords