Scientific Reports (May 2024)

Stacked neural network for predicting polygenic risk score

  • Sun bin Kim,
  • Joon Ho Kang,
  • MyeongJae Cheon,
  • Dong Jun Kim,
  • Byung-Chul Lee

DOI
https://doi.org/10.1038/s41598-024-62513-1
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 15

Abstract

Read online

Abstract In recent years, the utility of polygenic risk scores (PRS) in forecasting disease susceptibility from genome-wide association studies (GWAS) results has been widely recognised. Yet, these models face limitations due to overfitting and the potential overestimation of effect sizes in correlated variants. To surmount these obstacles, we devised the Stacked Neural Network Polygenic Risk Score (SNPRS). This novel approach synthesises outputs from multiple neural network models, each calibrated using genetic variants chosen based on diverse p-value thresholds. By doing so, SNPRS captures a broader array of genetic variants, enabling a more nuanced interpretation of the combined effects of these variants. We assessed the efficacy of SNPRS using the UK Biobank data, focusing on the genetic risks associated with breast and prostate cancers, as well as quantitative traits like height and BMI. We also extended our analysis to the Korea Genome and Epidemiology Study (KoGES) dataset. Impressively, our results indicate that SNPRS surpasses traditional PRS models and an isolated deep neural network in terms of accuracy, highlighting its promise in refining the efficacy and relevance of PRS in genetic studies.

Keywords