Applied Sciences (Apr 2024)

Enhancing Insect Sound Classification Using Dual-Tower Network: A Fusion of Temporal and Spectral Feature Perception

  • Hangfei He,
  • Junyang Chen,
  • Hongkun Chen,
  • Borui Zeng,
  • Yutong Huang,
  • Yudan Zhaopeng,
  • Xiaoyan Chen

DOI
https://doi.org/10.3390/app14073116
Journal volume & issue
Vol. 14, no. 7
p. 3116

Abstract

In biological pest control, and in insect population monitoring in particular, deep learning methods have continued to advance. However, because insects are small and elusive, visual detection is often impractical, which makes the recognition of insect sound features crucial. In this study, we introduce a classification module called the “dual-frequency and spectral fusion module (DFSM)”, which enhances the performance of transfer learning models in audio classification tasks. Our approach combines the efficiency of EfficientNet with the hierarchical design of the Dual Towers, drawing inspiration from the way the insect neural system processes sound signals. This enables our model to effectively capture spectral features in insect sounds and to form multiscale perceptions through inter-tower skip connections. Through detailed qualitative and quantitative evaluations, as well as comparisons with leading traditional insect sound recognition methods, we demonstrate the advantages of our approach in insect sound classification. Our method achieves an accuracy of 80.26% on InsectSet32, surpassing existing state-of-the-art models by 3 percentage points. Additionally, we conducted generalization experiments on three classic audio datasets. The results indicate that DFSM is robust and widely applicable, with minimal performance variation even when given different input features.

Keywords