Health Data Science (Jan 2023)

#ChronicPain: Automated Building of a Chronic Pain Cohort from Twitter Using Machine Learning

  • Abeed Sarker,
  • Sahithi Lakamana,
  • Yuting Guo,
  • Yao Ge,
  • Abimbola Leslie,
  • Omolola Okunromade,
  • Elena Gonzalez-Polledo,
  • Jeanmarie Perrone,
  • Anne Marie McKenzie-Brown

DOI
https://doi.org/10.34133/hds.0078
Journal volume & issue
Vol. 3

Abstract

Background: Due to the high burden of chronic pain and the detrimental public health consequences of its treatment with opioids, there is a high-priority need to identify effective alternative therapies. Social media is a potentially valuable resource for knowledge about therapies self-reported by chronic pain sufferers.

Methods: We attempted to (a) verify the presence of large-scale chronic pain-related chatter on Twitter, (b) develop natural language processing and machine learning methods for automatically detecting self-disclosures of chronic pain, (c) collect longitudinal data posted by the users who made those self-disclosures, and (d) semiautomatically analyze the types of chronic pain-related information they reported. We collected data using chronic pain-related hashtags and keywords and manually annotated 4,998 posts to indicate whether they were self-reports of chronic pain experiences. We trained and evaluated several state-of-the-art supervised text classification models and deployed the best-performing classifier. We collected all publicly available posts from detected cohort members and conducted manual and natural language processing-driven descriptive analyses.

Results: Interannotator agreement for the binary annotation was 0.82 (Cohen’s kappa). The RoBERTa model performed best (F1 score: 0.84; 95% confidence interval: 0.80 to 0.89), and we used this model to classify all collected unlabeled posts. We discovered 22,795 self-reported chronic pain sufferers and collected over 3 million of their past posts. Further analyses revealed information about, but not limited to, alternative treatments, patient sentiments about treatments, side effects, and self-management strategies.

Conclusion: Our social media-based approach will yield a large cohort that grows automatically over time, and the data can be leveraged to identify effective opioid-alternative therapies for diverse chronic pain types.
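
The two kinds of figures reported in the Results, Cohen's kappa for interannotator agreement and an F1 score with a 95% confidence interval, can be illustrated with a minimal Python sketch. The abstract does not state how the confidence interval was computed, so the percentile bootstrap used here is an assumption, as are the function names and the toy labels; none of this reproduces the study's data or code.

```python
import random


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    # Observed agreement: fraction of items with identical labels.
    p_observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    p_expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if p_expected == 1:  # both annotators used one identical label throughout
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


def f1_score(y_true, y_pred, positive=1):
    """F1 score for the positive class (here: self-report of chronic pain)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_f1_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for F1 (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Resample test items with replacement and rescore.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_score([y_true[i] for i in idx],
                               [y_pred[i] for i in idx]))
    scores.sort()
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = max(lo_idx, int((1 - alpha / 2) * n_resamples) - 1)
    return scores[lo_idx], scores[hi_idx]


# Hypothetical toy labels: 1 = self-report of chronic pain, 0 = not.
annotator_a = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
annotator_b = [1, 1, 0, 0, 1, 0, 1, 1, 1, 0]

print(f"kappa: {cohen_kappa(annotator_a, annotator_b):.2f}")
print(f"F1:    {f1_score(annotator_a, annotator_b):.2f}")
low, high = bootstrap_f1_ci(annotator_a, annotator_b)
print(f"95% CI (bootstrap): {low:.2f} to {high:.2f}")
```

With only ten toy items the bootstrap interval is very wide; on a held-out test set of realistic size it narrows considerably.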