Department of Neurobiology, Harvard Medical School, Boston, United States
Nivanthika K Wimalasena
Department of Neurobiology, Harvard Medical School, Boston, United States; F.M. Kirby Neurobiology Center, Boston Children’s Hospital, Boston, United States
Kelsey J Clausing
Department of Molecular Biology, Massachusetts General Hospital, Boston, United States; Department of Genetics, Harvard Medical School, Boston, United States
Yu Y Dai
Department of Molecular Biology, Massachusetts General Hospital, Boston, United States; Department of Genetics, Harvard Medical School, Boston, United States
David A Yarmolinsky
Department of Neurobiology, Harvard Medical School, Boston, United States; F.M. Kirby Neurobiology Center, Boston Children’s Hospital, Boston, United States
Tomás Cruz
Champalimaud Neuroscience Programme, Champalimaud Center for the Unknown, Lisbon, Portugal
Adam D Kashlan
Department of Neurobiology, Harvard Medical School, Boston, United States; F.M. Kirby Neurobiology Center, Boston Children’s Hospital, Boston, United States
Champalimaud Neuroscience Programme, Champalimaud Center for the Unknown, Lisbon, Portugal
Lauren L Orefice
Department of Molecular Biology, Massachusetts General Hospital, Boston, United States; Department of Genetics, Harvard Medical School, Boston, United States
Clifford J Woolf
Department of Neurobiology, Harvard Medical School, Boston, United States; F.M. Kirby Neurobiology Center, Boston Children’s Hospital, Boston, United States
Videos of animal behavior are used to quantify researcher-defined behaviors of interest to study neural function, gene mutations, and pharmacological therapies. Behaviors of interest are often scored manually, which is time-consuming, limited to a few behaviors, and variable across researchers. We created DeepEthogram: software that uses supervised machine learning to convert raw video pixels into an ethogram, the behaviors of interest present in each video frame. DeepEthogram is designed to be general-purpose and applicable across species, behaviors, and video-recording hardware. It uses convolutional neural networks to compute motion, extract features from motion and images, and classify features into behaviors. Behaviors are classified with above 90% accuracy on single frames in videos of mice and flies, matching expert-level human performance. DeepEthogram accurately predicts rare behaviors, requires little training data, and generalizes across subjects. A graphical interface allows beginning-to-end analysis without end-user programming. DeepEthogram's rapid, automatic, and reproducible labeling of researcher-defined behaviors of interest may accelerate and enhance supervised behavior analysis. Code is available at https://github.com/jbohnslav/deepethogram.
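To make the output format concrete, the sketch below shows what an ethogram looks like as a data structure: the final classification stage produces per-frame, per-behavior probabilities, which are thresholded into a binary matrix with one row per video frame and one column per behavior. This is a minimal illustration, not DeepEthogram's actual code; the behavior names, probability values, and 0.5 threshold are hypothetical.

```python
# Hypothetical behavior labels for illustration only.
BEHAVIORS = ["groom", "rear", "scratch"]

def to_ethogram(frame_probs, threshold=0.5):
    """Threshold per-frame classifier probabilities into a binary
    ethogram: one row per video frame, one column per behavior.
    Multiple behaviors may be active in the same frame."""
    return [[1 if p >= threshold else 0 for p in frame]
            for frame in frame_probs]

# Example: four frames of (made-up) classifier outputs.
probs = [
    [0.92, 0.10, 0.05],
    [0.88, 0.20, 0.60],  # two behaviors co-occur in this frame
    [0.30, 0.75, 0.10],
    [0.05, 0.12, 0.08],  # no behavior above threshold: background
]
ethogram = to_ethogram(probs)
```

Note that each frame is scored independently for every behavior, which is why rows like the second one can mark more than one behavior at once.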