Evaluating and improving health equity and fairness of polygenic scores

Tianyu Zhang; Geyu Zhou; Lambertus Klei; Peng Liu; Alexandra Chouldechova; Hongyu Zhao; Kathryn Roeder; Max G’Sell; Bernie Devlin

HGG Advances (Apr 2024)

Affiliations

Tianyu Zhang: Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Corresponding author
Geyu Zhou: Department of Biostatistics, Yale University, New Haven, CT 06511, USA
Lambertus Klei: Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA 15213, USA
Peng Liu: Merck Research Laboratories, Merck & Co., Inc., Rahway, NJ 07065, USA
Alexandra Chouldechova: Microsoft Research NYC, New York, NY 10012, USA; Heinz College of Information Systems and Public Policy, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Hongyu Zhao: Department of Biostatistics, Yale University, New Haven, CT 06511, USA
Kathryn Roeder: Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Max G’Sell: Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Bernie Devlin: Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA 15213, USA; Corresponding author

Journal volume & issue: Vol. 5, no. 2
p. 100280

Abstract

Summary: Polygenic scores (PGSs) are quantitative metrics for predicting phenotypic values, such as human height or disease status. Some PGS methods require only summary statistics of a relevant genome-wide association study (GWAS) for their score. One such method is Lassosum, which inherits the model selection advantages of Lasso to select a meaningful subset of the GWAS single-nucleotide polymorphisms as predictors from their association statistics. However, even efficient scores like Lassosum, when derived from European-based GWASs, are poor predictors of phenotype for subjects of non-European ancestry; that is, they have limited portability to other ancestries. To increase the portability of Lassosum, when GWAS information and estimates of linkage disequilibrium are available for both ancestries, we propose Joint-Lassosum (JLS). In the simulation settings we explore, JLS provides more accurate PGSs compared to other methods, especially when measured in terms of fairness. In analyses of UK Biobank data, JLS was computationally more efficient but slightly less accurate than a Bayesian comparator, SDPRX. Like all PGS methods, JLS requires selection of predictors, which are determined by data-driven tuning parameters. We describe a new approach to selecting tuning parameters and note its relevance for model selection for any PGS. We also draw connections to the literature on algorithmic fairness and discuss how JLS can help mitigate fairness-related harms that might result from the use of PGSs in clinical settings. While no PGS method is likely to be universally portable, due to the diversity of human populations and unequal information content of GWASs for different ancestries, JLS is an effective approach for enhancing portability and reducing predictive bias.
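As a rough illustration of the idea behind a Lasso fitted from summary statistics alone, the sketch below runs coordinate descent on an L1-penalized objective that needs only marginal SNP-phenotype correlations and a linkage disequilibrium (LD) matrix. It is a toy, single-ancestry example and not the authors' JLS implementation or the Lassosum software; the function names, the shrinkage parameter `s`, and the input values are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_from_summary_stats(r, R, lam, s=0.1, n_iter=200, tol=1e-8):
    """Toy Lasso using only summary-level inputs (illustrative, hypothetical API).

    r   : marginal SNP-phenotype correlations from a GWAS (length p)
    R   : LD (SNP-SNP correlation) matrix from a reference panel (p x p)
    lam : L1 penalty; larger values select fewer SNPs
    s   : assumed shrinkage of R toward the identity, for stability

    Minimizes  b' R_s b - 2 r' b + 2 * lam * ||b||_1  by coordinate descent.
    """
    p = len(r)
    R_s = (1.0 - s) * R + s * np.eye(p)
    beta = np.zeros(p)
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            # Correlation of SNP j with the phenotype after removing the
            # contribution of all other SNPs currently in the model.
            partial = r[j] - R_s[j] @ beta + R_s[j, j] * beta[j]
            new = soft_threshold(partial, lam) / R_s[j, j]
            max_change = max(max_change, abs(new - beta[j]))
            beta[j] = new
        if max_change < tol:
            break
    return beta

# With uncorrelated SNPs (identity LD), the solution is simply the
# soft-thresholded marginal correlations, so weak signals drop out.
r_demo = np.array([0.5, 0.05, -0.3])
beta_demo = lasso_from_summary_stats(r_demo, np.eye(3), lam=0.1)
```

In practice the penalty `lam` and the shrinkage `s` are the kind of data-driven tuning parameters the summary refers to; this sketch fixes them by hand.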

Published in HGG Advances

ISSN: 2666-2477 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Science: Biology (General): Genetics
Website: https://www.cell.com/hgg-advances/home
