Remote Sensing (Jan 2022)
Unlocking the Potential of Deep Learning for Migratory Waterbirds Monitoring Using Surveillance Video
Abstract
Estimates of migratory waterbird populations provide the essential scientific basis for guiding the conservation of coastal wetlands, which are heavily modified and threatened by economic development. New equipment and technology have been increasingly introduced in protected areas to expand monitoring efforts, among which video surveillance and other unmanned devices are widely used in coastal wetlands. However, the massive amount of video records brings the dual challenge of storage and analysis. Manual analysis methods are time-consuming and error-prone, representing a significant bottleneck to rapid data processing and the dissemination and application of results. Recently, video processing with deep learning has emerged as a solution, but its ability to accurately identify and count waterbirds across habitat types (e.g., mudflat, saltmarsh, and open water) remains untested in coastal environments. In this study, we developed a two-step automatic waterbird monitoring framework. The first step automatically segments, selects, processes, and mosaics video footage into panorama images covering the entire monitoring area; in the second step, these images are used for counting and density estimation with a depth density estimation network (DDE). We tested the effectiveness and performance of the framework at Tiaozini, Jiangsu Province, China, a restored wetland that provides a key high-tide roosting ground for migratory waterbirds along the East Asian–Australasian Flyway. The results showed that our approach achieved an accuracy of 85.59%, outperforming many other popular deep learning algorithms. Furthermore, the standard error of our model was very small (se = 0.0004), indicating the high stability of the method. The framework is also computationally efficient: it takes about one minute to process a panorama covering the entire site on a high-performance desktop computer. These results demonstrate that our framework can accurately extract ecologically meaningful data and information from video surveillance footage to assist biodiversity monitoring, filling the gap in the efficient use of existing monitoring equipment deployed in protected areas.
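The two-step pipeline summarized above can be illustrated with a minimal, hypothetical Python sketch: frame mosaicking followed by density-map-based counting. The DDE architecture is not described in the abstract, so a generic density-estimation model stands in for it; the function names (`build_panorama`, `count_birds`) and the use of OpenCV's general-purpose image stitcher are assumptions for illustration only, not the authors' implementation.

```python
# Hypothetical sketch of the two-step framework described in the abstract.
# Step 1: mosaic surveillance-video frames into a single panorama image.
# Step 2: run a density-estimation network on the panorama and integrate
#         the predicted density map to obtain a bird count.
import cv2
import numpy as np
import torch


def build_panorama(frames):
    """Mosaic a list of BGR frames (NumPy arrays) into one panorama image."""
    stitcher = cv2.Stitcher_create()
    status, panorama = stitcher.stitch(frames)
    if status != cv2.Stitcher_OK:
        raise RuntimeError(f"Stitching failed with status code {status}")
    return panorama


def count_birds(panorama, density_model, device="cpu"):
    """Estimate a density map and sum it to get the total count.

    `density_model` is any network that maps an RGB image tensor of shape
    (1, 3, H, W) to a density map of shape (1, 1, H', W'); the DDE network
    from the paper would play this role.
    """
    rgb = cv2.cvtColor(panorama, cv2.COLOR_BGR2RGB)
    tensor = torch.from_numpy(rgb).permute(2, 0, 1).float().unsqueeze(0) / 255.0
    with torch.no_grad():
        density_map = density_model(tensor.to(device))
    # Integrating (summing) the density map yields the estimated count.
    return float(density_map.sum())
```

In practice, the input frames would be sampled from the surveillance footage (e.g., with `cv2.VideoCapture`) so that consecutive views overlap enough for stitching, and the count would be reported per panorama, matching the roughly one-minute-per-site processing time noted above.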
Keywords