BMC Medical Informatics and Decision Making (Jun 2023)

Using machine learning to develop a clinical prediction model for SSRI-associated bleeding: a feasibility study

  • Jatin Goyal,
  • Ding Quan Ng,
  • Kevin Zhang,
  • Alexandre Chan,
  • Joyce Lee,
  • Kai Zheng,
  • Keri Hurley-Kim,
  • Lee Nguyen,
  • Lu He,
  • Megan Nguyen,
  • Sarah McBane,
  • Wei Li,
  • Christine Luu Cadiz

DOI
https://doi.org/10.1186/s12911-023-02206-3
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 11

Abstract

Introduction
Adverse drug events (ADEs) are associated with poor outcomes and increased costs but may be prevented with prediction tools. Using the National Institutes of Health All of Us (AoU) database, we employed machine learning (ML) to predict selective serotonin reuptake inhibitor (SSRI)-associated bleeding.

Methods
The AoU program, which began in May 2018, continues to recruit individuals aged ≥ 18 years across the United States. Participants completed surveys and consented to contribute electronic health record (EHR) data for research. Using the EHR, we identified participants who were exposed to SSRIs (citalopram, escitalopram, fluoxetine, fluvoxamine, paroxetine, sertraline, vortioxetine). Features (n = 88) were selected with clinicians' input and comprised sociodemographic, lifestyle, comorbidity, and medication-use information. We identified bleeding events with validated EHR algorithms and applied logistic regression, decision tree, random forest, and extreme gradient boosting to predict bleeding during SSRI exposure. We assessed model performance with the area under the receiver operating characteristic curve (AUC) and defined clinically significant features as those whose removal from the model reduced the AUC by > 0.01 in three of the four ML models.

Results
There were 10,362 participants exposed to SSRIs, of whom 9.6% experienced a bleeding event during SSRI exposure. For each SSRI, performance across all four ML models was relatively consistent. AUCs from the best models ranged from 0.632 to 0.698. Clinically significant features included health literacy for escitalopram, and bleeding history and socioeconomic status for all SSRIs.

Conclusions
We demonstrated the feasibility of predicting ADEs using ML. Incorporating genomic features and drug interactions with deep learning models may improve ADE prediction.
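
The feature-significance rule described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`auc`, `clinically_significant`), the model keys, and all AUC values in the usage example are hypothetical, and "three of four models" is read here as "at least three", which is an assumption.

```python
def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def clinically_significant(baseline, ablated, threshold=0.01, min_models=3):
    """Return features whose removal lowers the AUC by more than `threshold`
    in at least `min_models` models (assumed reading of "three of four").

    baseline: {model_name: AUC with all features}
    ablated:  {feature_name: {model_name: AUC with that feature removed}}
    """
    selected = []
    for feature, per_model in ablated.items():
        hits = sum(
            1 for model, value in per_model.items()
            if baseline[model] - value > threshold
        )
        if hits >= min_models:
            selected.append(feature)
    return selected


# Hypothetical AUCs for illustration only; these are not results from the study.
baseline = {"lr": 0.690, "dt": 0.650, "rf": 0.695, "xgb": 0.698}
ablated = {
    "bleeding_history": {"lr": 0.660, "dt": 0.630, "rf": 0.670, "xgb": 0.675},
    "age": {"lr": 0.688, "dt": 0.649, "rf": 0.670, "xgb": 0.697},
}
significant = clinically_significant(baseline, ablated)
```

In this made-up example, removing `bleeding_history` lowers the AUC by more than 0.01 in all four models, so it is flagged, whereas removing `age` does so in only one model and is not.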

Keywords