iScience (Jun 2022)

A graph-embedded topic model enables characterization of diverse pain phenotypes among UK biobank individuals

  • Yuening Wang,
  • Rodrigo Benavides,
  • Luda Diatchenko,
  • Audrey V. Grant,
  • Yue Li

Journal volume & issue
Vol. 25, no. 6
p. 104390

Abstract


Summary: Large biobank repositories of clinical conditions and medications data open opportunities to investigate the phenotypic disease network. We present a graph-embedded topic model (GETM) that integrates existing biomedical knowledge graph information, in the form of pre-trained graph embeddings, into the embedded topic model. Via a variational autoencoder framework, we infer patient phenotypic mixtures by modeling multi-modal discrete patient medical records. We applied GETM to UK Biobank (UKB) self-reported clinical phenotype data, which contain 443 self-reported medical conditions and 802 medications for 457,461 individuals. Compared to existing methods, GETM demonstrates good imputation performance. In a more focused application to characterizing pain phenotypes, we observe that GETM-inferred phenotypes not only accurately predict the status of chronic musculoskeletal (CMK) pain but also reveal known pain-related topics. Intriguingly, medications and conditions in the cardiovascular category are enriched among the most predictive topics of chronic pain.
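
To make the modeling idea in the summary concrete, the following is a minimal sketch of how an embedded topic model can combine fixed, pre-trained code embeddings with a sampled patient topic mixture. It is not the authors' implementation. All names (`rho`, `alpha`, `theta`, `beta`), the toy sizes, and the random stand-in values are assumptions made for illustration, and only a single modality is shown, whereas the summary describes multi-modal records (conditions and medications).

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_codes, n_topics, emb_dim = 6, 3, 4  # toy sizes, not the UKB dimensions

# Assumption: random stand-in for pre-trained knowledge-graph embeddings,
# one row per condition or medication code (held fixed in this sketch).
rho = rng.normal(size=(n_codes, emb_dim))

# Assumption: topic embeddings in the same space; learned in a real model.
alpha = rng.normal(size=(n_topics, emb_dim))

# Topic-over-code distributions: each row is a distribution over codes.
beta = softmax(alpha @ rho.T, axis=1)

# In a variational autoencoder, an encoder would map a patient's record to
# a mean and log-variance; random values stand in for encoder output here.
mu = rng.normal(size=n_topics)
log_var = rng.normal(size=n_topics)

# Reparameterization trick: sample the latent vector, then map it to a
# topic mixture on the simplex.
delta = mu + np.exp(0.5 * log_var) * rng.normal(size=n_topics)
theta = softmax(delta)

# Expected probability of each code for this patient under the mixture.
p_codes = theta @ beta
```

In this sketch, codes that lie close together in the embedding space receive similar weight within a topic, which is one way prior graph knowledge could shape the inferred phenotypes.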

Keywords