DCNN for Pig Vocalization and Non-Vocalization Classification: Evaluate Model Robustness with New Data

Vandet Pann; Kyeong-seok Kwon; Byeonghyeon Kim; Dong-Hwa Jang; Jong-Bok Kim

doi:10.3390/ani14142029

Animals (Jul 2024)

Affiliations

Vandet Pann: Animal Environment Division, National Institute of Animal Science, Rural Development Administration, Wanju 55365, Republic of Korea
Kyeong-seok Kwon: Animal Environment Division, National Institute of Animal Science, Rural Development Administration, Wanju 55365, Republic of Korea
Byeonghyeon Kim: Animal Environment Division, National Institute of Animal Science, Rural Development Administration, Wanju 55365, Republic of Korea
Dong-Hwa Jang: Animal Environment Division, National Institute of Animal Science, Rural Development Administration, Wanju 55365, Republic of Korea
Jong-Bok Kim: Animal Environment Division, National Institute of Animal Science, Rural Development Administration, Wanju 55365, Republic of Korea

DOI: https://doi.org/10.3390/ani14142029
Journal volume & issue: Vol. 14, no. 14, p. 2029

Abstract

Pig vocalization is an important indicator of pig condition, so detecting and recognizing pig vocalizations with deep learning plays a crucial role in the management and welfare of modern pig farming. However, collecting pig sound data for training deep learning models takes time and effort. Acknowledging this challenge, this study introduces a deep convolutional neural network (DCNN) architecture for classifying pig vocalization and non-vocalization sounds using datasets from real pig farms. Several audio feature extraction methods were evaluated individually to compare their performance: Mel-frequency cepstral coefficients (MFCC), Mel-spectrogram, Chroma, and Tonnetz. The study also proposes a novel feature extraction method, Mixed-MMCT, which integrates MFCC, Mel-spectrogram, Chroma, and Tonnetz features to improve classification accuracy. These methods were used to extract features from the pig sound datasets for input into the deep learning network. For the experiments, three datasets were collected from three actual pig farms: Nias, Gimje, and Jeongeup. Each dataset consists of 4000 WAV files (2000 pig vocalization and 2000 pig non-vocalization), each three seconds long. Several audio data augmentation techniques were applied to the training set to improve model performance and generalization: pitch-shifting, time-shifting, time-stretching, and background-noising. Model performance was assessed on each dataset using k-fold cross-validation (k = 5). Mixed-MMCT achieved the highest accuracy on Nias, Gimje, and Jeongeup, at 99.50%, 99.56%, and 99.67%, respectively. Robustness experiments were then performed to assess how well the model handles new data, using two farm datasets as the training set and the remaining farm as the test set. The average accuracy, precision, recall, and F1-score of Mixed-MMCT reached 95.67%, 96.25%, 95.68%, and 95.96%, respectively. All results demonstrate that the proposed Mixed-MMCT feature extraction method outperforms the other methods for pig vocalization and non-vocalization classification in real pig farming environments.
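
The robustness protocol described in the abstract (train on two farm datasets, test on the remaining farm, then average accuracy, precision, recall, and F1-score) can be sketched as follows. This is only an illustrative outline, not the authors' implementation. The function names, the sample layout (feature vector plus a 0/1 label), and the `fit_threshold` stand-in classifier are all assumptions made for the example; in the study the classifier is a DCNN fed with extracted audio features.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

# Assumed layout: (feature vector, label), label 1 = vocalization, 0 = non-vocalization.
Sample = Tuple[Sequence[float], int]
Predictor = Callable[[Sequence[float]], int]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def leave_one_farm_out(
    farms: Dict[str, List[Sample]],
) -> Iterator[Tuple[str, List[Sample], List[Sample]]]:
    """Yield (held-out farm, training samples from the other farms, test samples)."""
    for test_farm in farms:
        train = [s for name, data in farms.items() if name != test_farm for s in data]
        yield test_farm, train, farms[test_farm]


def evaluate(
    farms: Dict[str, List[Sample]],
    fit: Callable[[List[Sample]], Predictor],
) -> Tuple[Dict[str, Dict[str, float]], Dict[str, float]]:
    """Run every train/test split and average each metric over the held-out farms."""
    per_farm = {}
    for test_farm, train, test in leave_one_farm_out(farms):
        predict = fit(train)
        y_true = [label for _, label in test]
        y_pred = [predict(x) for x, _ in test]
        per_farm[test_farm] = binary_metrics(y_true, y_pred)
    average = {
        key: sum(m[key] for m in per_farm.values()) / len(per_farm)
        for key in ("accuracy", "precision", "recall", "f1")
    }
    return per_farm, average


def fit_threshold(train: List[Sample]) -> Predictor:
    """Hypothetical stand-in for the DCNN: thresholds the first feature at the
    midpoint between the two class means. Needs both classes in `train`."""
    pos = [x[0] for x, label in train if label == 1]
    neg = [x[0] for x, label in train if label == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x[0] > threshold else 0
```

With a dictionary keyed by farm name (for example `"Nias"`, `"Gimje"`, and `"Jeongeup"`), `evaluate(farms, fit_threshold)` returns the per-farm metrics and their averages. Swapping `fit_threshold` for a function that trains a real model leaves the splitting and averaging logic unchanged.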

Published in Animals

ISSN: 2076-2615 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Agriculture: Animal culture: Veterinary medicine; Science: Zoology
Website: http://www.mdpi.com/journal/animals/