Journal of Medical Internet Research (Mar 2024)

Using Longitudinal Twitter Data for Digital Epidemiology of Childhood Health Outcomes: An Annotated Data Set and Deep Neural Network Classifiers

  • Ari Z Klein,
  • José Agustín Gutiérrez Gómez,
  • Lisa D Levine,
  • Graciela Gonzalez-Hernandez

DOI
https://doi.org/10.2196/50652
Journal volume & issue
Vol. 26
p. e50652

Abstract

We manually annotated 9734 tweets posted by users who reported their pregnancy on Twitter, and used them to train, evaluate, and deploy deep neural network classifiers (F1-score=0.93) that detect tweets reporting a child with attention-deficit/hyperactivity disorder (678 users), autism spectrum disorders (1744 users), delayed speech (902 users), or asthma (1255 users). These results demonstrate the potential of Twitter as a complementary resource for assessing associations between pregnancy exposures and childhood health outcomes on a large scale.