Low-dimensional learned feature spaces quantify individual and group differences in vocal repertoires

Jack Goffinet; Samuel Brudner; Richard Mooney; John Pearson

doi:10.7554/elife.67855

eLife (May 2021)

Affiliations

Jack Goffinet: Department of Computer Science, Duke University, Durham, United States; Center for Cognitive Neurobiology, Duke University, Durham, United States; Department of Neurobiology, Duke University, Durham, United States
Samuel Brudner: Department of Neurobiology, Duke University, Durham, United States
Richard Mooney: Department of Neurobiology, Duke University, Durham, United States
John Pearson: Center for Cognitive Neurobiology, Duke University, Durham, United States; Department of Neurobiology, Duke University, Durham, United States; Department of Biostatistics & Bioinformatics, Duke University, Durham, United States; Department of Electrical and Computer Engineering, Duke University, Durham, United States

DOI: https://doi.org/10.7554/elife.67855
Journal volume & issue: Vol. 10

Abstract

Increases in the scale and complexity of behavioral data pose a growing challenge for data analysis. A common strategy involves replacing entire behaviors with small numbers of handpicked, domain-specific features, but this approach suffers from several crucial limitations. For example, handpicked features may miss important dimensions of variability, and correlations among them complicate statistical testing. Here, by contrast, we apply the variational autoencoder (VAE), an unsupervised learning method, to learn features directly from data and quantify the vocal behavior of two model species: the laboratory mouse and the zebra finch. The VAE converges on a parsimonious representation that outperforms handpicked features on a variety of common analysis tasks, enables the measurement of moment-by-moment vocal variability on the timescale of tens of milliseconds in the zebra finch, provides strong evidence that mouse ultrasonic vocalizations do not cluster as is commonly believed, and captures the similarity of tutor and pupil birdsong with qualitatively higher fidelity than previous approaches. In all, we demonstrate the utility of modern unsupervised learning approaches to the quantification of complex and high-dimensional vocal behavior.
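
For readers unfamiliar with the method named in the abstract, the following is a minimal sketch of the generic VAE training objective, not the authors' implementation. It assumes a diagonal-Gaussian encoder, a standard normal prior, and a unit-variance Gaussian decoder (so the reconstruction term reduces to squared error, up to a constant). The function names are hypothetical, and the encoder and decoder networks themselves are omitted.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) (the reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def negative_elbo(x, x_hat, mu, log_var):
    """Reconstruction error plus KL penalty for one input.

    Assumption: unit-variance Gaussian decoder, so the reconstruction term
    is half the squared error (additive constants dropped).
    """
    recon = 0.5 * sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + kl_to_standard_normal(mu, log_var)
```

In this framing, the encoder would map a spectrogram to `mu` and `log_var`, a latent vector would be drawn with `reparameterize`, and the decoder's reconstruction `x_hat` would be scored with `negative_elbo`; the low-dimensional latent means then serve as the learned features.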

Published in eLife

ISSN: 2050-084X (Online)
Publisher: eLife Sciences Publications Ltd
Country of publisher: United Kingdom
LCC subjects: Medicine; Science: Biology (General)
Website: https://elifesciences.org
