Remote Sensing in Ecology and Conservation (Mar 2020)
Automated identification of avian vocalizations with deep convolutional neural networks
Abstract
Passive acoustic monitoring is an emerging approach to wildlife monitoring that leverages recent improvements in automated recording units and other technologies. A central challenge of this approach is the task of locating and identifying target species vocalizations in large volumes of audio data. To address this issue, we developed an efficient data processing pipeline using a deep convolutional neural network (CNN) to automate the detection of owl vocalizations in spectrograms generated from unprocessed field recordings. While the project was initially focused on spotted and barred owls, we also trained the network to recognize northern saw‐whet owl, great horned owl, northern pygmy‐owl, and western screech‐owl. Although classification performance varies across species, initial results are promising. Recall, or the proportion of calls in the dataset that are detected and correctly identified, ranged from 63.1% for barred owl to 91.5% for spotted owl based on raw network output. Precision, the rate of true positives among apparent detections, ranged from 0.4% for spotted owl to 77.1% for northern saw‐whet owl based on raw output. In limited tests, the CNN performed as well as or better than human technicians at detecting owl calls. Our model output is suitable for developing species encounter histories for occupancy models and other analyses. We believe our approach is sufficiently general to support long‐term, large‐scale monitoring of a broad range of species beyond our target species list, including birds, mammals, and others.
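As context for the metrics reported above, precision and recall can be computed from detection counts as sketched below. This is an illustrative example only; the function name and the counts are hypothetical and are not taken from the study.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Compute precision and recall from detection counts.

    Precision: proportion of apparent detections that are true positives.
    Recall: proportion of actual calls that are detected and correctly identified.
    """
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Hypothetical counts for illustration (not data from the study):
p, r = precision_recall(true_positives=90, false_positives=10, false_negatives=30)
# p = 0.90 (90 of 100 apparent detections are correct)
# r = 0.75 (90 of 120 actual calls are recovered)
```

Note that a detector can have high recall but very low precision, as reported here for spotted owl: if true calls are rare relative to the volume of audio screened, even a low false-positive rate yields many more false detections than true ones.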
Keywords