Prediction of adverse drug reactions using demographic and non-clinical drug characteristics in FAERS data

Alireza Farnoush; Zahra Sedighi-Maman; Behnam Rasoolian; Jonathan J. Heath; Banafsheh Fallah

doi:10.1038/s41598-024-74505-2

Scientific Reports (Oct 2024)

Affiliations

Alireza Farnoush: Darla Moore School of Business, University of South Carolina
Zahra Sedighi-Maman: Robert B. Willumstad School of Business, Adelphi University
Behnam Rasoolian: Department of Industrial and System Engineering, Auburn University
Jonathan J. Heath: School of Business, St. Bonaventure University, St Bonaventure
Banafsheh Fallah: Department of Industrial and System Engineering, Auburn University

DOI: https://doi.org/10.1038/s41598-024-74505-2
Journal volume & issue: Vol. 14, no. 1, pp. 1–11

Abstract

Adverse drug reactions (ADRs) are an ongoing public health concern. Because traditional methods of discovering ADRs are costly and limited, it is prudent to predict ADRs with non-invasive methods such as machine learning on existing data. Although various studies address ADR prediction from non-clinical data, a process that leverages both demographic and non-clinical data is missing, and the importance of individual features in ADR prediction has yet to be fully explored. This study develops an ADR prediction model based on demographic and non-clinical data and identifies the highest-contributing factors. We focus on 30 common and severe ADRs reported to the Food and Drug Administration (FDA) between 2012 and 2023. We developed random forest (RF) and deep learning (DL) models that ingest demographic data (e.g., patient age and gender) and non-clinical data, which include chemical, molecular, and biological drug characteristics. We unified both data sources within a single complete dataset for ADR prediction. Model performance was assessed via the area under the receiver operating characteristic curve (AUC) and the mean average precision (MAP). Our parsimonious models, which include only the top 20 most important features (5 demographic and 15 non-clinical, of which 13 are molecular and 2 biological), achieve ADR prediction performance comparable to a less practical, feature-rich model with all 2,315 features. Specifically, our models achieved an AUC of 0.611 and 0.674 for the RF and DL algorithms, respectively. We hope this research provides researchers and clinicians with valuable insights and facilitates future research designs by identifying top ADR predictors (including demographic information) and practical parsimonious models.
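
For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal pure-Python sketch of how AUC and MAP can be computed from binary labels and predicted scores. It is not the authors' code. The function names, the toy labels and scores, and the choice to average precision across ADR labels are all assumptions made for illustration; the abstract does not state how MAP was aggregated.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average of precision@k taken at each rank k where a positive occurs,
    with examples ranked by descending score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


def mean_average_precision(per_label):
    """Mean of average precision over several (labels, scores) pairs.
    Assumption: one pair per ADR; the source does not specify this."""
    aps = [average_precision(y, s) for y, s in per_label]
    return sum(aps) / len(aps)


# Toy data (made up, not from FAERS): two hypothetical ADRs, four reports each.
adr_a = ([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
adr_b = ([0, 1, 0, 1], [0.2, 0.6, 0.4, 0.9])

print("AUC (ADR A):", roc_auc(*adr_a))
print("MAP over both ADRs:", mean_average_precision([adr_a, adr_b]))
```

In practice, library implementations such as scikit-learn's `roc_auc_score` and `average_precision_score` would normally be used instead. The pairwise AUC above is quadratic in the number of examples and is meant only to make the definition concrete.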

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
