Speeding up training of automated bird recognizers by data reduction of audio features

Allan G. de Oliveira; Thiago M. Ventura; Todor D. Ganchev; Lucas N.S. Silva; Marinêz I. Marques; Karl-L. Schuchmann

doi:10.7717/peerj.8407

PeerJ (Jan 2020)

Affiliations

Allan G. de Oliveira: Computational Bioacoustics Research Unit (CO.BRA), National Institute for Science and Technology in Wetlands (INAU), Universidade Federal de Mato Grosso, Cuiabá, Mato Grosso, Brazil
Thiago M. Ventura: Computational Bioacoustics Research Unit (CO.BRA), National Institute for Science and Technology in Wetlands (INAU), Universidade Federal de Mato Grosso, Cuiabá, Mato Grosso, Brazil
Todor D. Ganchev: Computational Bioacoustics Research Unit (CO.BRA), National Institute for Science and Technology in Wetlands (INAU), Universidade Federal de Mato Grosso, Cuiabá, Mato Grosso, Brazil
Lucas N.S. Silva: Computational Bioacoustics Research Unit (CO.BRA), National Institute for Science and Technology in Wetlands (INAU), Universidade Federal de Mato Grosso, Cuiabá, Mato Grosso, Brazil
Marinêz I. Marques: Computational Bioacoustics Research Unit (CO.BRA), National Institute for Science and Technology in Wetlands (INAU), Universidade Federal de Mato Grosso, Cuiabá, Mato Grosso, Brazil
Karl-L. Schuchmann: Computational Bioacoustics Research Unit (CO.BRA), National Institute for Science and Technology in Wetlands (INAU), Universidade Federal de Mato Grosso, Cuiabá, Mato Grosso, Brazil

DOI: https://doi.org/10.7717/peerj.8407
Journal volume & issue: Vol. 8, p. e8407

Abstract

Automated acoustic recognition of birds is considered an important technology in support of biodiversity monitoring and biodiversity conservation activities. These activities require processing large amounts of soundscape recordings. Typically, recordings are transformed to a number of acoustic features, and a machine learning method is used to build models and recognize the sound events of interest. The main problem is the scalability of data processing, either for developing models or for processing recordings made over long time periods. In those cases, the processing time and resources required might become prohibitive for the average user. To address this problem, we evaluated the applicability of three data reduction methods. These methods were applied to a series of acoustic feature vectors as an additional postprocessing step, which aims to reduce the computational demand during training. The experimental results obtained using Mel-frequency cepstral coefficients (MFCCs) and hidden Markov models (HMMs) support the finding that a reduction in training data by a factor of 10 does not significantly affect the recognition performance.
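The abstract does not name the three data reduction methods that were evaluated, so the following is only an illustrative sketch of the general idea: shrinking a sequence of acoustic feature vectors by a factor of 10 before it is passed to model training. The function name `reduce_frames`, the two strategies shown (plain decimation and block averaging), and the 13-coefficient random matrix standing in for MFCCs are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def reduce_frames(features: np.ndarray, factor: int = 10, mode: str = "mean") -> np.ndarray:
    """Reduce a (n_frames, n_coeffs) feature matrix along the time axis.

    Hypothetical helper for illustration only.
    mode="decimate": keep every `factor`-th frame.
    mode="mean":     average each block of `factor` consecutive frames,
                     dropping any incomplete trailing block.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    if mode == "decimate":
        return features[::factor]
    if mode == "mean":
        n_full = (features.shape[0] // factor) * factor
        blocks = features[:n_full].reshape(-1, factor, features.shape[1])
        return blocks.mean(axis=1)
    raise ValueError(f"unknown mode: {mode!r}")


# Stand-in for real MFCCs: 1000 frames of 13 coefficients (random data).
rng = np.random.default_rng(0)
mfcc = rng.normal(size=(1000, 13))

reduced = reduce_frames(mfcc, factor=10, mode="mean")
decimated = reduce_frames(mfcc, factor=10, mode="decimate")
```

Both variants turn the 1000-frame input into 100 frames. Whether such a reduction preserves recognition performance for a given model is an empirical question of the kind the study addresses; this sketch shows only the mechanics of the postprocessing step.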

Published in PeerJ

ISSN: 2167-8359 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Medicine; Science: Biology (General)
Website: https://peerj.com/
