Machine learning models for predicting blood pressure phenotypes by combining multiple polygenic risk scores

Yana Hrytsenko; Benjamin Shea; Michael Elgart; Nuzulul Kurniansyah; Genevieve Lyons; Alanna C. Morrison; April P. Carson; Bernhard Haring; Braxton D. Mitchell; Bruce M. Psaty; Byron C. Jaeger; C. Charles Gu; Charles Kooperberg; Daniel Levy; Donald Lloyd-Jones; Eunhee Choi; Jennifer A. Brody; Jennifer A. Smith; Jerome I. Rotter; Matthew Moll; Myriam Fornage; Noah Simon; Peter Castaldi; Ramon Casanova; Ren-Hua Chung; Robert Kaplan; Ruth J. F. Loos; Sharon L. R. Kardia; Stephen S. Rich; Susan Redline; Tanika Kelly; Timothy O’Connor; Wei Zhao; Wonji Kim; Xiuqing Guo; Yii-Der Ida Chen; The Trans-Omics in Precision Medicine Consortium; Tamar Sofer

doi:10.1038/s41598-024-62945-9

Scientific Reports (May 2024)

Affiliations

Yana Hrytsenko: Department of Medicine, Brigham and Women’s Hospital
Benjamin Shea: CardioVascular Institute (CVI), Beth Israel Deaconess Medical Center
Michael Elgart: Department of Medicine, Brigham and Women’s Hospital
Nuzulul Kurniansyah: Department of Medicine, Brigham and Women’s Hospital
Genevieve Lyons: Department of Biostatistics, Harvard T.H. Chan School of Public Health
Alanna C. Morrison: Department of Epidemiology, School of Public Health, Human Genetics Center, The University of Texas Health Science Center at Houston
April P. Carson: Department of Medicine, University of Mississippi Medical Center
Bernhard Haring: Department of Epidemiology & Population Health, Albert Einstein College of Medicine
Braxton D. Mitchell: Department of Medicine, University of Maryland School of Medicine
Bruce M. Psaty: Department of Medicine, University of Washington
Byron C. Jaeger: Department of Biostatistics and Data Science, Wake Forest University School of Medicine
C. Charles Gu: The Center for Biostatistics and Data Science, Washington University
Charles Kooperberg: Division of Public Health Sciences, Fred Hutchinson Cancer Center
Daniel Levy: The Population Sciences Branch of the National Heart, Lung and Blood Institute
Donald Lloyd-Jones: Department of Preventive Medicine, Northwestern University
Eunhee Choi: Columbia Hypertension Laboratory, Department of Medicine, Columbia University Irving Medical Center
Jennifer A. Brody: Department of Medicine, University of Washington
Jennifer A. Smith: Department of Epidemiology, School of Public Health, University of Michigan
Jerome I. Rotter: Department of Pediatrics, The Institute for Translational Genomics and Population Sciences, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center
Matthew Moll: Department of Medicine, Brigham and Women’s Hospital
Myriam Fornage: Department of Epidemiology, School of Public Health, Human Genetics Center, The University of Texas Health Science Center at Houston
Noah Simon: Department of Biostatistics, School of Public Health, University of Washington
Peter Castaldi: Department of Medicine, Brigham and Women’s Hospital
Ramon Casanova: Department of Biostatistics and Data Science, Wake Forest University School of Medicine
Ren-Hua Chung: Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes
Robert Kaplan: Department of Epidemiology & Population Health, Albert Einstein College of Medicine
Ruth J. F. Loos: The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai
Sharon L. R. Kardia: Department of Epidemiology, School of Public Health, University of Michigan
Stephen S. Rich: Center for Public Health Genomics, University of Virginia School of Medicine
Susan Redline: Department of Medicine, Brigham and Women’s Hospital
Tanika Kelly: Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine
Timothy O’Connor: Department of Medicine, University of Maryland School of Medicine
Wei Zhao: Department of Epidemiology, School of Public Health, University of Michigan
Wonji Kim: Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital
Xiuqing Guo: Department of Pediatrics, The Institute for Translational Genomics and Population Sciences, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center
Yii-Der Ida Chen: Department of Pediatrics, The Institute for Translational Genomics and Population Sciences, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center
The Trans-Omics in Precision Medicine Consortium
Tamar Sofer: Department of Medicine, Brigham and Women’s Hospital

DOI: https://doi.org/10.1038/s41598-024-62945-9
Journal volume & issue: Vol. 14, no. 1
pp. 1 – 17

Abstract

We construct non-linear machine learning (ML) prediction models for systolic and diastolic blood pressure (SBP, DBP) using demographic and clinical variables and polygenic risk scores (PRSs). We develop a two-model ensemble consisting of a baseline model, where prediction is based on demographic and clinical variables only, and a genetic model, where we also include PRSs. We evaluate the use of a linear versus a non-linear model at both the baseline and the genetic model levels and assess the improvement in performance when incorporating multiple PRSs. We report the ensemble model’s performance as percentage variance explained (PVE) on a held-out test dataset. A non-linear baseline model improved the PVE from 28.1% to 30.1% (SBP) and from 14.3% to 17.4% (DBP) compared with a linear baseline model. Including seven PRSs, computed from the largest available GWAS of SBP/DBP, in the genetic model improved the genetic model PVE from 4.8% to 5.1% (SBP) and from 4.7% to 5% (DBP) compared with using a single PRS. Adding 14 additional PRSs, computed from two independent GWASs, further increased the genetic model PVE to 6.3% (SBP) and 5.7% (DBP). PVE differed across self-reported race/ethnicity groups, with primarily all non-White groups benefitting from the inclusion of additional PRSs. In summary, non-linear ML models improve BP prediction in models incorporating diverse populations.
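The two-stage idea and the PVE metric described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state how the two models are combined, so fitting the genetic model to the baseline model's residuals is an assumption here. Simple one-predictor least squares stands in for the non-linear ML models, the data are hypothetical toy values, and PVE is computed in-sample rather than on a held-out test set as in the study.

```python
from statistics import fmean


def pve(observed, predicted):
    """Percentage of variance explained: 100 * (1 - SS_residual / SS_total)."""
    mean_obs = fmean(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 100.0 * (1.0 - ss_resid / ss_total)


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical toy data (not from the study): age as the only
# baseline covariate, one PRS, and systolic blood pressure in mmHg.
age = [30, 40, 50, 60, 70, 80]
prs = [1.0, -1.0, 0.5, -0.5, 1.0, -1.0]
sbp = [112, 113, 124, 126, 137, 138]

# Stage 1: baseline model using the demographic/clinical covariate only.
b0, b1 = fit_line(age, sbp)
baseline_pred = [b0 + b1 * a for a in age]

# Stage 2 (assumed design): genetic model fitted to the baseline residuals.
residuals = [y - p for y, p in zip(sbp, baseline_pred)]
g0, g1 = fit_line(prs, residuals)
genetic_pred = [g0 + g1 * s for s in prs]

# Ensemble prediction is the sum of the two stages.
ensemble_pred = [b + g for b, g in zip(baseline_pred, genetic_pred)]

print("baseline PVE:", pve(sbp, baseline_pred))
print("ensemble PVE:", pve(sbp, ensemble_pred))
```

With real data, each stage would be fitted on a training set and PVE reported on a separate test set, where the ensemble is not guaranteed to outperform the baseline.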

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
