PLoS Computational Biology (Aug 2021)

Model checking via testing for direct effects in Mendelian Randomization and transcriptome-wide association studies.

  • Yangqing Deng,
  • Wei Pan

DOI
https://doi.org/10.1371/journal.pcbi.1009266
Journal volume & issue
Vol. 17, no. 8
p. e1009266

Abstract

Read online

It is of great interest and potential to discover causal relationships between pairs of exposures and outcomes using genetic variants as instrumental variables (IVs) to deal with hidden confounding in observational studies. Two most popular approaches are Mendelian randomization (MR), which usually use independent genetic variants/SNPs across the genome, and transcriptome-wide association studies (TWAS) (or their generalizations) using cis-SNPs local to a gene (or some genome-wide and likely dependent SNPs), as IVs. In spite of their many promising applications, both approaches face a major challenge: the validity of their causal conclusions depends on three critical assumptions on valid IVs, and more generally on other modeling assumptions, which however may not hold in practice. The most likely as well as challenging situation is due to the wide-spread horizontal pleiotropy, leading to two of the three IV assumptions being violated and thus to biased statistical inference. More generally, we'd like to conduct a goodness-of-fit (GOF) test to check the model being used. Although some methods have been proposed as being robust to various degrees to the violation of some modeling assumptions, they often give different and even conflicting results due to their own modeling assumptions and possibly lower statistical efficiency, imposing difficulties to the practitioner in choosing and interpreting varying results across different methods. Hence, it would help to directly test whether any assumption is violated or not. In particular, there is a lack of such tests for TWAS. We propose a new and general GOF test, called TEDE (TEsting Direct Effects), applicable to both correlated and independent SNPs/IVs (as commonly used in TWAS and MR respectively). Through simulation studies and real data examples, we demonstrate high statistical power and advantages of our new method, while confirming the frequent violation of modeling (including valid IV) assumptions in practice and thus the importance of model checking by applying such a test in MR/TWAS analysis.