Avian Conservation and Ecology (Dec 2017)

Recommendations for acoustic recognizer performance assessment with application to five common automated signal recognition programs

  • Elly C. Knight,
  • Kevin C. Hannah,
  • Gabriel J. Foley,
  • Chris D. Scott,
  • R. Mark Brigham,
  • Erin Bayne

DOI
https://doi.org/10.5751/ACE-01114-120214
Journal volume & issue
Vol. 12, no. 2
p. 14

Abstract

Automated signal recognition software is increasingly used to extract species detection data from acoustic recordings collected using autonomous recording units (ARUs), but there is little practical guidance available for ecologists on the application of this technology. Performance evaluation is an important part of employing automated acoustic recognition technology because the quality of the resulting data can vary with many factors. We reviewed the bioacoustic literature to summarize performance evaluation and found little consistency in evaluation approach, metrics employed, or terminology used. We also found that few studies examined how score threshold, i.e., the cut-off for the level of confidence in a target species classification, affected performance, but those that did showed a strong effect of score threshold on performance. We used the lessons learned from our literature review and best practices from the field of machine learning to evaluate the performance of five readily available automated signal recognition programs. We used the Common Nighthawk (Chordeiles minor) as our model species because it has simple, consistent, and frequent vocalizations. We found that automated signal recognition was effective for determining Common Nighthawk presence-absence and call rate, particularly at low score thresholds, but that occupancy estimates from the data processed with recognizers were consistently lower than those from data generated by human listening and became unstable at high score thresholds. Of the five programs evaluated, our convolutional neural network (CNN) recognizer performed best, with recognizers built in Song Scope and MonitoR also performing well. The RavenPro and Kaleidoscope recognizers were moderately effective, but produced more false positives than the other recognizers. Finally, we synthesized six general recommendations for ecologists who employ automated signal recognition software, including what to use as a test benchmark, how to incorporate score threshold, what metrics to use, and how to evaluate efficiency. Future studies should consider our recommendations to build a body of literature on the effectiveness of this technology for avian research and monitoring.
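
To illustrate the kind of threshold-dependent evaluation the abstract describes, the sketch below (not from the paper) computes precision and recall of a recognizer's output at several score thresholds. The data are hypothetical: each recognizer "hit" is a (score, verified) pair from manual checking of recognizer output, and n_true_calls is assumed to be the number of target calls a human listener found in the same benchmark recordings.

# Minimal sketch, assuming hypothetical recognizer output and benchmark counts.
def precision_recall_by_threshold(hits, n_true_calls, thresholds):
    """hits: list of (score, is_true_positive) tuples from recognizer output."""
    results = []
    for t in thresholds:
        kept = [is_tp for score, is_tp in hits if score >= t]
        tp = sum(kept)                       # verified detections retained at this threshold
        fp = len(kept) - tp                  # false positives retained at this threshold
        precision = tp / (tp + fp) if kept else float("nan")
        recall = tp / n_true_calls if n_true_calls else float("nan")
        results.append((t, precision, recall))
    return results

# Example with made-up scores and verification labels:
hits = [(92, True), (88, True), (75, False), (70, True), (55, False), (40, True)]
for t, p, r in precision_recall_by_threshold(hits, n_true_calls=5,
                                             thresholds=[40, 60, 80]):
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")

In this toy example, raising the threshold increases precision (fewer false positives survive) while lowering recall (true calls are discarded), which is the trade-off the abstract highlights when recommending that score threshold be reported alongside performance metrics.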

Keywords