Ecological Indicators (Apr 2022)
Semantic segmentation-based whistle extraction of Indo-Pacific Bottlenose Dolphin residing at the coast of Jeju island
Abstract
Passive acoustic monitoring (PAM) is commonly used to monitor the distribution, abundance, and behavior of cetacean species. The demand for automated methods to detect and extract cetacean vocalizations from acoustic data has increased in the last few decades. Automatic whistle extraction for the Indo-Pacific Bottlenose Dolphin (IPBD) and other whistle-producing delphinids inhabiting coastal areas is a challenging problem because of high ambient noise, including ship noise and snapping shrimp sound in the same habitat. Acoustic signals containing snapping shrimp sound have usually been excluded during the development of detection methods, so a bioacoustic tool that is robust to snapping shrimp sound is still lacking. This study trained a convolutional neural network (CNN) designed for semantic segmentation, an architecture initially developed for autonomous driving, to extract the whistle contour from the spectrogram at the pixel level. A total of 1600 datasets were annotated for training and testing. The semantic segmentation classified the whistle with an overall mean Precision of 0.96, Accuracy of 0.89, and F-score of 0.86. In particular, the semantic segmentation extracted the whistle even when it was accompanied by abundant snapping shrimp sound, which the conventional method cannot do. The improvement in metrics presented in this paper will enable long-term assessment of the IPBD population and tracking of individuals or groups.
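To make the reported metrics concrete, the following is a minimal sketch of how Precision, Accuracy, and F-score can be computed when a predicted whistle mask is compared with an annotated mask pixel by pixel. It assumes the standard confusion-matrix definitions; the abstract does not state the exact evaluation procedure, so the function name `pixel_metrics`, the tiny example masks, and the use of binary 0/1 masks are illustrative assumptions rather than the authors' implementation.

```python
def pixel_metrics(pred, truth):
    """Pixel-wise metrics for two binary masks (lists of rows of 0/1).

    Assumption: 1 marks a whistle pixel, 0 marks background.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # whistle pixel correctly extracted
            elif p and not t:
                fp += 1          # background labelled as whistle
            elif t:
                fn += 1          # whistle pixel missed
            else:
                tn += 1          # background correctly rejected

    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_score": f_score,
    }


# Hypothetical 3x4 spectrogram masks (rows = frequency bins, columns = time frames).
truth_mask = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
pred_mask = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 1],
]

metrics = pixel_metrics(pred_mask, truth_mask)
print(metrics)
```

In this toy case the prediction has four true positives, one false positive, and one false negative, so precision and recall are both 0.8. Because whistle pixels are typically a small fraction of a spectrogram, accuracy is dominated by background pixels, which is one reason precision and F-score are reported alongside it.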