Multivariate adaptive shrinkage improves cross-population transcriptome prediction and association studies in underrepresented populations

Daniel S. Araujo; Chris Nguyen; Xiaowei Hu; Anna V. Mikhaylova; Chris Gignoux; Kristin Ardlie; Kent D. Taylor; Peter Durda; Yongmei Liu; George Papanicolaou; Michael H. Cho; Stephen S. Rich; Jerome I. Rotter; Hae Kyung Im; Ani Manichaikul; Heather E. Wheeler

HGG Advances (Oct 2023)

Affiliations

Daniel S. Araujo: Program in Bioinformatics, Loyola University Chicago, Chicago, IL 60660, USA
Chris Nguyen: Department of Biology, Loyola University Chicago, Chicago, IL 60660, USA
Xiaowei Hu: Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA
Anna V. Mikhaylova: Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
Chris Gignoux: Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, UC Denver Anschutz Medical Campus, Aurora, CO 80045, USA
Kristin Ardlie: Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Kent D. Taylor: The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, the Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA 90502, USA
Peter Durda: Laboratory for Clinical Biochemistry Research, University of Vermont, Colchester, VT 05446, USA
Yongmei Liu: Department of Medicine, Duke University School of Medicine, Durham, NC 27710, USA
George Papanicolaou: Epidemiology Branch, Division of Cardiovascular Sciences, National Heart, Lung and Blood Institute, Bethesda, MD 20892, USA
Michael H. Cho: Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA 02115, USA
Stephen S. Rich: Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA
Jerome I. Rotter: The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, the Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA 90502, USA
Hae Kyung Im: Section of Genetic Medicine, University of Chicago, Chicago, IL 60637, USA
Ani Manichaikul: Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA
Heather E. Wheeler: Program in Bioinformatics, Loyola University Chicago, Chicago, IL 60660, USA; Department of Biology, Loyola University Chicago, Chicago, IL 60660, USA; Corresponding author

Journal volume & issue: Vol. 4, no. 4, p. 100216

Abstract

Summary: Transcriptome prediction models built with data from European-descent individuals are less accurate when applied to different populations because of differences in linkage disequilibrium patterns and allele frequencies. We hypothesized that methods that leverage shared regulatory effects across different conditions, in this case, across different populations, may improve cross-population transcriptome prediction. To test this hypothesis, we built transcriptome prediction models for use in transcriptome-wide association studies (TWASs) using different methods (elastic net, joint-tissue imputation [JTI], matrix expression quantitative trait loci [Matrix eQTL], multivariate adaptive shrinkage in R [MASHR], and transcriptome-integrated genetic association resource [TIGAR]) and tested their out-of-sample transcriptome prediction accuracy in population-matched and cross-population scenarios. Additionally, to evaluate model applicability in TWASs, we integrated publicly available multiethnic genome-wide association study (GWAS) summary statistics from the Population Architecture using Genomics and Epidemiology (PAGE) study and Pan-ancestry genetic analysis of the UK Biobank (PanUKBB) with our developed transcriptome prediction models. In terms of transcriptome prediction accuracy, MASHR models performed as well as or better than other methods in both population-matched and cross-population transcriptome predictions. Furthermore, in multiethnic TWASs, MASHR models yielded more discoveries that replicate in both PAGE and PanUKBB than any other method analyzed, including loci previously mapped in GWASs and loci not previously found in GWASs. Overall, our study demonstrates the importance of using methods that benefit from different populations’ effect size estimates to improve TWASs for multiethnic or underrepresented populations.

Published in HGG Advances

ISSN: 2666-2477 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Science: Biology (General): Genetics
Website: https://www.cell.com/hgg-advances/home
