Frontiers in Remote Sensing (Aug 2024)

Automatic detection of unidentified fish sounds: a comparison of traditional machine learning with deep learning

  • Xavier Mouy,
  • Stephanie K. Archer,
  • Stan Dosso,
  • Sarah Dudas,
  • Philina English,
  • Colin Foord,
  • William Halliday,
  • Francis Juanes,
  • Darienne Lancaster,
  • Sofie Van Parijs,
  • Dana Haggarty

DOI
https://doi.org/10.3389/frsen.2024.1439995
Journal volume & issue
Vol. 5

Abstract

Many species of fishes around the world are soniferous. The sounds fishes produce vary among species and regions but typically consist of low-frequency (<1.5 kHz) pulses and grunts. These sounds could be used to monitor fishes non-intrusively and to complement traditional monitoring techniques. However, the significant time human analysts need to manually label fish sounds in acoustic recordings has so far prevented passive acoustics from being a viable tool for monitoring fishes. In this paper, we compare two approaches to automatically detecting fish sounds. The first is a more traditional machine learning technique that detects acoustic transients in the spectrogram and classifies them with a Random Forest (RF). The second is a deep learning approach that classifies overlapping segments (0.2 s) of the spectrogram using a ResNet18 Convolutional Neural Network (CNN). Both algorithms were trained using 21,950 manually annotated fish and non-fish sounds collected from 2014 to 2019 at five different locations in the Strait of Georgia, British Columbia, Canada. The detectors were tested on data from the Strait of Georgia that were withheld from the training phase, data from Barkley Sound, British Columbia, and data collected in the Port of Miami, Florida, United States. The CNN performed up to 1.9 times better than the RF (F1 score: 0.82 vs. 0.43). In some cases, the CNN found more faint fish sounds than the analyst, and it performed well in environments different from the one it was trained in (Miami F1 score: 0.88). Noise analysis in the 20–1,000 Hz frequency band shows that the CNN remains reliable at noise levels greater than 130 dB re 1 μPa in the Port of Miami but becomes less reliable in Barkley Sound above 100 dB re 1 μPa due to mooring noise. The proposed approach can efficiently monitor (unidentified) fish sounds in a variety of environments and can also facilitate the development of species-specific detectors. We provide the software FishSound Finder, an easy-to-use open-source implementation of the CNN detector with detailed documentation.
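
The abstract describes the CNN detector as classifying overlapping 0.2 s segments of the spectrogram. The segmentation step alone might look like the minimal Python sketch below. It is an illustration under stated assumptions and not the FishSound Finder implementation: the function name `segment_spectrogram`, the 0.01 s frame duration, and the 0.1 s hop between segments are all hypothetical, since the abstract gives only the 0.2 s segment length.

```python
import numpy as np


def segment_spectrogram(spec, frame_dt, win_s=0.2, hop_s=0.1):
    """Split a (frequency, time) spectrogram into overlapping time windows.

    frame_dt is the duration of one spectrogram frame in seconds.
    win_s matches the 0.2 s segment length given in the abstract;
    hop_s (the step between segment starts) is an assumed value.
    """
    win = int(round(win_s / frame_dt))  # frames per segment
    hop = int(round(hop_s / frame_dt))  # frames between segment starts
    if win < 1 or hop < 1:
        raise ValueError("win_s and hop_s must each span at least one frame")
    n_frames = spec.shape[1]
    if n_frames < win:
        raise ValueError("spectrogram is shorter than one segment")
    starts = np.arange(0, n_frames - win + 1, hop)
    # Result has shape (n_segments, n_freq_bins, win)
    segments = np.stack([spec[:, s:s + win] for s in starts])
    return segments, starts * frame_dt


# Synthetic example: 64 frequency bins, 100 frames of 0.01 s (1 s of data)
rng = np.random.default_rng(0)
spec = rng.random((64, 100))
segments, start_times = segment_spectrogram(spec, frame_dt=0.01)
print(segments.shape)
```

Each slice `segments[i]` would then be passed to a classifier that labels it as fish or non-fish. With a hop shorter than the window, a sound that straddles one segment boundary still falls wholly inside a neighbouring segment.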

Keywords