Nature Communications (Aug 2024)

SR-TWAS: leveraging multiple reference panels to improve transcriptome-wide association study power by ensemble machine learning

  • Randy L. Parrish,
  • Aron S. Buchman,
  • Shinya Tasaki,
  • Yanling Wang,
  • Denis Avey,
  • Jishu Xu,
  • Philip L. De Jager,
  • David A. Bennett,
  • Michael P. Epstein,
  • Jingjing Yang

DOI
https://doi.org/10.1038/s41467-024-50983-w
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 16

Abstract

Multiple reference panels often exist for a given tissue or across multiple tissues, and several regression methods can be used to train gene expression imputation models for transcriptome-wide association studies (TWAS). To leverage expression imputation models (i.e., base models) trained with multiple reference panels, regression methods, and tissues, we develop a Stacked Regression based TWAS (SR-TWAS) tool that obtains optimal linear combinations of base models for a given validation transcriptomic dataset. Both simulation and real studies show that SR-TWAS improves power, owing to increased training sample sizes and strength borrowed across multiple regression methods and tissues. Leveraging base models across multiple reference panels, tissues, and regression methods, our real studies identify 6 independent significant risk genes for Alzheimer’s disease (AD) dementia in supplementary motor area tissue and 9 independent significant risk genes for Parkinson’s disease (PD) in substantia nigra tissue. Relevant biological interpretations are found for these significant risk genes.
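
The idea of combining base models linearly against a validation dataset can be illustrated with a minimal sketch. It is not the SR-TWAS implementation, and it rests on assumptions the abstract does not state: the weights are taken to be non-negative and to sum to one, the objective is mean squared error on the validation expression values, and the solver is a simple exponentiated-gradient update chosen here for brevity. The function names (`stack_weights`, `mse`) and the simulated data are hypothetical.

```python
import math
import random


def stack_weights(base_preds, observed, n_iter=1500, lr=0.1):
    """Estimate stacking weights on the probability simplex.

    base_preds: list of K lists, each holding one base model's predicted
                expression for the n validation samples (assumed layout).
    observed:   list of n observed expression values in the validation data.
    Uses exponentiated-gradient descent on mean squared error, which keeps
    every weight positive and renormalises them to sum to one each step.
    """
    k, n = len(base_preds), len(observed)
    w = [1.0 / k] * k  # start from the uniform average of the base models
    for _ in range(n_iter):
        # Residual of the current stacked prediction for each sample.
        resid = [
            sum(w[j] * base_preds[j][i] for j in range(k)) - observed[i]
            for i in range(n)
        ]
        # Gradient of the mean squared error with respect to each weight.
        grad = [
            2.0 / n * sum(r * p for r, p in zip(resid, base_preds[j]))
            for j in range(k)
        ]
        # Multiplicative update followed by renormalisation.
        w = [wj * math.exp(-lr * gj) for wj, gj in zip(w, grad)]
        total = sum(w)
        w = [wj / total for wj in w]
    return w


def mse(pred, observed):
    """Mean squared error between predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(pred, observed)) / len(observed)


# Simulated validation data (purely illustrative, not from the paper).
random.seed(0)
n = 150
truth = [random.gauss(0, 1) for _ in range(n)]
base_preds = [
    [t + random.gauss(0, 0.3) for t in truth],  # accurate base model
    [t + random.gauss(0, 1.0) for t in truth],  # noisy base model
    [random.gauss(0, 1.0) for _ in truth],      # uninformative base model
]

weights = stack_weights(base_preds, truth)
stacked = [sum(w * p[i] for w, p in zip(weights, base_preds)) for i in range(n)]
uniform = [sum(p[i] for p in base_preds) / len(base_preds) for i in range(n)]

print("weights:", [round(w, 3) for w in weights])
print("stacked MSE:", round(mse(stacked, truth), 4))
print("uniform-average MSE:", round(mse(uniform, truth), 4))
```

In this toy setting the accurate base model receives most of the weight while the noisy and uninformative models are down-weighted rather than discarded, which is the sense in which a stacked combination can borrow strength across base models.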