Communications Biology (Aug 2022)
Non-linear machine learning models incorporating SNPs and PRS improve polygenic prediction in diverse human populations
- Michael Elgart,
- Genevieve Lyons,
- Santiago Romero-Brufau,
- Nuzulul Kurniansyah,
- Jennifer A. Brody,
- Xiuqing Guo,
- Henry J. Lin,
- Laura Raffield,
- Yan Gao,
- Han Chen,
- Paul de Vries,
- Donald M. Lloyd-Jones,
- Leslie A. Lange,
- Gina M. Peloso,
- Myriam Fornage,
- Jerome I. Rotter,
- Stephen S. Rich,
- Alanna C. Morrison,
- Bruce M. Psaty,
- Daniel Levy,
- Susan Redline,
- the NHLBI’s Trans-Omics in Precision Medicine (TOPMed) Consortium,
- Tamar Sofer
Affiliations
- Michael Elgart
- Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital
- Genevieve Lyons
- Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital
- Santiago Romero-Brufau
- Department of Biostatistics, Harvard T.H. Chan School of Public Health
- Nuzulul Kurniansyah
- Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital
- Jennifer A. Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington
- Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center
- Henry J. Lin
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center
- Laura Raffield
- Department of Genetics, University of North Carolina
- Yan Gao
- The Jackson Heart Study, University of Mississippi Medical Center
- Han Chen
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston
- Paul de Vries
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston
- Donald M. Lloyd-Jones
- Department of Preventive Medicine, Northwestern University
- Leslie A. Lange
- Department of Medicine, University of Colorado Denver, Anschutz Medical Campus
- Gina M. Peloso
- Department of Biostatistics, Boston University School of Public Health
- Myriam Fornage
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston
- Jerome I. Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center
- Stephen S. Rich
- Center for Public Health Genomics, University of Virginia School of Medicine
- Alanna C. Morrison
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston
- Bruce M. Psaty
- Cardiovascular Health Research Unit, Departments of Medicine, Epidemiology, and Health Services, University of Washington
- Daniel Levy
- The Population Sciences Branch of the National Heart, Lung and Blood Institute
- Susan Redline
- Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital
- the NHLBI’s Trans-Omics in Precision Medicine (TOPMed) Consortium
- Tamar Sofer
- Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital
- DOI
- https://doi.org/10.1038/s42003-022-03812-z
- Journal volume & issue
- Vol. 5, no. 1, pp. 1–12
Abstract
Including a standard polygenic risk score (PRS) as a feature in a machine learning model increases the percentage of variance explained for the traits being predicted, helping to account for non-linearities or interaction effects in genetics-based prediction models.
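
As a rough illustration of the idea, the sketch below simulates a trait with one SNP-by-SNP interaction and compares held-out variance explained (R^2) for a PRS-only model against a model that takes the PRS together with SNP-level terms. Everything here is an assumption made for illustration: the genotypes, effect sizes, and sample sizes are simulated, the helper `holdout_r2` is hypothetical, and ordinary least squares with a hand-built product feature stands in for whichever non-linear learner the paper actually uses, which the abstract does not name.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated cohort (hypothetical sizes, not taken from the paper).
n_people, n_snps, n_train = 4000, 20, 2000
genotypes = rng.binomial(2, 0.3, size=(n_people, n_snps)).astype(float)  # 0/1/2 allele counts
effects = rng.normal(0.0, 0.2, size=n_snps)  # assumed per-SNP weights

# Standard additive PRS: weighted sum of allele counts.
prs = genotypes @ effects

# Simulated trait: additive signal + an interaction between SNP 0 and SNP 1 + noise.
interaction = genotypes[:, 0] * genotypes[:, 1]
trait = prs + 1.0 * interaction + rng.normal(0.0, 1.0, size=n_people)


def holdout_r2(features, target, n_train):
    """Fit least squares on the first n_train rows; return R^2 on the remaining rows."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design[:n_train], target[:n_train], rcond=None)
    test_y = target[n_train:]
    residual = test_y - design[n_train:] @ coef
    return 1.0 - np.sum(residual**2) / np.sum((test_y - test_y.mean()) ** 2)


# Model 1: PRS as the only predictor.
r2_prs_only = holdout_r2(prs.reshape(-1, 1), trait, n_train)

# Model 2: PRS plus two SNPs and their product (a hand-built non-linear feature).
augmented = np.column_stack([prs, genotypes[:, 0], genotypes[:, 1], interaction])
r2_augmented = holdout_r2(augmented, trait, n_train)

print(f"PRS only:        R^2 = {r2_prs_only:.3f}")
print(f"PRS + SNP terms: R^2 = {r2_augmented:.3f}")
```

In this toy setup the interaction is known in advance and coded by hand. A tree-based or other non-linear learner would instead have to discover such terms from the SNP features, with the PRS supplying the additive signal.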