Elephant Sound Classification Using Deep Learning Optimization

Hiruni Dewmini; Dulani Meedeniya; Charith Perera

doi:10.3390/s25020352

Sensors (Jan 2025)

Affiliations

Hiruni Dewmini: Department of Computer Science and Engineering, University of Moratuwa, Moratuwa 10400, Sri Lanka
Dulani Meedeniya: Department of Computer Science and Engineering, University of Moratuwa, Moratuwa 10400, Sri Lanka
Charith Perera: School of Computer Science and Informatics, Cardiff University, Cardiff CF24 3AA, UK

DOI: https://doi.org/10.3390/s25020352
Journal volume & issue: Vol. 25, no. 2, p. 352

Abstract

Elephant sound identification is crucial in wildlife conservation and ecological research. Identifying elephant vocalizations provides insights into behavior, social dynamics, and emotional expression, which in turn supports elephant conservation. This study addresses elephant sound classification using raw audio processing. We focus on lightweight models suitable for deployment on resource-constrained edge devices, including MobileNet, YAMNet, and RawNet, and introduce a novel model termed ElephantCallerNet. The proposed ElephantCallerNet achieves an accuracy of 89% when classifying raw audio directly, without converting it to spectrograms. Using Bayesian optimization, we tuned key hyperparameters such as learning rate, dropout, and kernel size, which improved the model’s performance. We also evaluated spectrogram-based training, a prevalent approach in animal sound classification; in the comparative analysis, raw audio processing outperformed the spectrogram-based methods. In contrast to other models in the literature, which primarily focus on a single caller type or on binary classification of whether a sound is an elephant voice, our solution classifies three distinct caller types: roar, rumble, and trumpet.
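The hyperparameter tuning mentioned in the abstract can be illustrated with a minimal Bayesian optimization loop. This is a sketch under stated assumptions, not the authors' implementation: the abstract does not specify the surrogate model, the acquisition function, or the search ranges, so the Gaussian-process surrogate, the expected-improvement criterion, the ranges in `SPACE`, and every function name below are hypothetical. The objective `toy_validation_accuracy` is a synthetic stand-in; a real run would train the classifier and return its validation accuracy.

```python
import math
import random

# Hypothetical search ranges (low, high); the abstract names the
# hyperparameters but not their ranges.
SPACE = {"log10_lr": (-5.0, -2.0), "dropout": (0.0, 0.6), "kernel_size": (3.0, 15.0)}

def to_config(u):
    """Map a point in the unit cube to concrete hyperparameter values."""
    cfg = {name: lo + x * (hi - lo) for x, (name, (lo, hi)) in zip(u, SPACE.items())}
    cfg["kernel_size"] = int(round(cfg["kernel_size"]))
    return cfg

def toy_validation_accuracy(cfg):
    """Synthetic stand-in objective (assumption); its maximum value is 0.9."""
    return (0.9 - 0.05 * (cfg["log10_lr"] + 3.0) ** 2
            - 0.5 * (cfg["dropout"] - 0.3) ** 2
            - 0.002 * (cfg["kernel_size"] - 9) ** 2)

def rbf(a, b, length=0.3):
    """Squared-exponential kernel with unit prior variance."""
    return math.exp(-sum((x - y) ** 2 for x, y in zip(a, b)) / (2 * length ** 2))

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_fit(X, y, noise=1e-4):
    """Fit a Gaussian-process surrogate to the observations (X, y)."""
    mean_y = sum(y) / len(y)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, [v - mean_y for v in y]))
    return L, alpha, mean_y

def gp_predict(model, X, x_new):
    """Posterior mean and standard deviation of the surrogate at x_new."""
    L, alpha, mean_y = model
    k = [rbf(a, x_new) for a in X]
    mu = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve_lower(L, k)
    return mu, math.sqrt(max(1.0 - sum(vi * vi for vi in v), 1e-12))

def expected_improvement(mu, sigma, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mu - best) * cdf + sigma * pdf

def bayes_opt(objective, n_init=5, n_iter=15, n_candidates=200, seed=0):
    """Random initial design, then repeatedly evaluate the candidate with the highest EI."""
    rng = random.Random(seed)
    dim = len(SPACE)
    X = [[rng.random() for _ in range(dim)] for _ in range(n_init)]
    y = [objective(to_config(u)) for u in X]
    for _ in range(n_iter):
        model, best = gp_fit(X, y), max(y)
        candidates = [[rng.random() for _ in range(dim)] for _ in range(n_candidates)]
        scores = [expected_improvement(*gp_predict(model, X, c), best) for c in candidates]
        u_next = candidates[scores.index(max(scores))]
        X.append(u_next)
        y.append(objective(to_config(u_next)))
    i = y.index(max(y))
    return to_config(X[i]), y[i]

if __name__ == "__main__":
    best_cfg, best_acc = bayes_opt(toy_validation_accuracy)
    print(best_cfg, round(best_acc, 4))
```

The learning rate is searched on a log scale because plausible values span several orders of magnitude, and the kernel size is rounded to an integer after sampling. Both are illustrative choices rather than details taken from the paper.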

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
