Mathematical Biosciences and Engineering (Jul 2023)

Automatic recognition of giant panda vocalizations using wide spectrum features and deep neural network

  • Zhiwu Liao,
  • Shaoxiang Hu,
  • Rong Hou,
  • Meiling Liu,
  • Ping Xu,
  • Zhihe Zhang,
  • Peng Chen

DOI
https://doi.org/10.3934/mbe.2023690
Journal volume & issue
Vol. 20, no. 8
pp. 15456 – 15475

Abstract

This study presents an automatic vocalization recognition system for giant pandas (GPs). Over 12,800 GP vocal samples were recorded at the Chengdu Research Base of Giant Panda Breeding (CRBGPB) and labeled by CRBGPB animal husbandry staff. The samples were divided into 16 categories of 800 samples each. A novel deep neural network (DNN), named 3Fbank-GRU, is proposed to label GP vocalizations automatically. Existing human vocalization recognition frameworks based on the Mel filter bank (Fbank) use only the low-frequency features of the voice. In contrast, we extract high-, medium- and low-frequency features using Fbank together with two filter banks that we derived ourselves, the Medium Mel Filter bank (MFbank) and the Reversed Mel Filter bank (RFbank). The three sets of frequency features are fed into 3Fbank-GRU for training and testing. Models trained on the staff-labeled datasets and then tested on recognition tasks achieved a recognition accuracy of over 95%, which indicates that the system can be used to accurately label large datasets of GP vocalizations collected by camera traps or other recording methods.
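The filter-bank idea in the abstract can be illustrated with a minimal sketch. The abstract does not define MFbank or RFbank, so the code below makes a loud assumption: the "reversed" bank is built by mirroring the Mel-spaced filter edges along the frequency axis, so that the narrow, dense filters sit at the high-frequency end instead of the low-frequency end. This may differ from the authors' derivation. MFbank is omitted because nothing in the abstract indicates how it is constructed. All function names and parameter values are hypothetical and use only the Python standard library.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_edge_freqs(n_filters, f_min, f_max):
    """Return n_filters + 2 filter edge frequencies (Hz), equally spaced in mel."""
    m_lo, m_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_hi - m_lo) / (n_filters + 1)
    return [mel_to_hz(m_lo + i * step) for i in range(n_filters + 2)]


def reversed_edge_freqs(n_filters, f_min, f_max):
    """ASSUMPTION: mirror the mel-spaced edges so resolution is finest at
    high frequencies. Not taken from the paper."""
    edges = mel_edge_freqs(n_filters, f_min, f_max)
    return sorted(f_min + f_max - e for e in edges)


def triangular_filters(edges, n_fft, sample_rate):
    """Build one triangular filter per interior edge over the FFT bin frequencies."""
    n_bins = n_fft // 2 + 1
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for i in range(1, len(edges) - 1):
        lo, mid, hi = edges[i - 1], edges[i], edges[i + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def log_filterbank_energies(power_spectrum, bank):
    """Apply each filter to one frame's power spectrum and take the log."""
    return [
        math.log(sum(w * p for w, p in zip(row, power_spectrum)) + 1e-10)
        for row in bank
    ]
```

In a pipeline of this kind, the per-frame feature vectors from each bank would be stacked over time and passed to a recurrent network such as a GRU. How 3Fbank-GRU combines the three feature streams is not stated in the abstract and is not sketched here.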

Keywords