Nature Communications (Aug 2023)

A versatile, fast and unbiased method for estimation of gene-by-environment interaction effects on biobank-scale datasets

  • Matteo Di Scipio
  • Mohammad Khan
  • Shihong Mao
  • Michael Chong
  • Conor Judge
  • Nazia Pathan
  • Nicolas Perrot
  • Walter Nelson
  • Ricky Lali
  • Shuang Di
  • Robert Morton
  • Jeremy Petch
  • Guillaume Paré

DOI
https://doi.org/10.1038/s41467-023-40913-7
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 15

Abstract

Identification of gene-by-environment interactions (GxE) is crucial to understanding the interplay of environmental effects on complex traits. However, current methods evaluating GxE on biobank-scale datasets have limitations. We introduce MonsterLM, a multiple linear regression method that does not rely on model specification and provides unbiased estimates of variance explained by GxE. We demonstrate robustness of MonsterLM through comprehensive genome-wide simulations using real genetic data from 325,989 individuals. We estimate GxE using waist-to-hip ratio, smoking, and exercise as the environmental variables on 13 outcomes (N = 297,529–325,989) in the UK Biobank. GxE variance is significant for 8 environment-outcome pairs, ranging from 0.009 to 0.071. The majority of GxE variance involves SNPs without strong marginal or interaction associations. We observe modest improvements in polygenic score prediction when incorporating GxE. Our results imply a significant contribution of GxE to complex trait variance, and we show MonsterLM to be well suited to handling this with biobank-scale data.
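
To make the idea of "variance explained by GxE" from a multiple linear regression concrete, the following is a minimal toy sketch in Python on simulated data. It is not the MonsterLM implementation, and nothing in it is taken from the paper beyond the general idea: the function names (`adjusted_r2`, `simulate`, `estimate_gxe_variance`), the sample size, the number of SNPs, and the simulated variance components are all assumptions chosen for illustration. The sketch fits one regression with genotypes and the environment, a second that also includes the SNP-by-environment product terms, and takes the gain in adjusted R² as a rough estimate of the GxE variance.

```python
import numpy as np


def adjusted_r2(X, y):
    """Adjusted R^2 of an ordinary least-squares fit of y on X (intercept added here)."""
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    # Penalise for the number of predictors so that pure-noise terms add ~0 on average.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def simulate(n=20000, m=50, var_g=0.20, var_gxe=0.05, seed=0):
    """Simulate standardised genotypes, an environment, and an outcome (all hypothetical)."""
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.05, 0.5, size=m)           # assumed minor allele frequencies
    G = rng.binomial(2, maf, size=(n, m)).astype(float)
    G = (G - G.mean(axis=0)) / G.std(axis=0)       # standardise each SNP
    E = rng.standard_normal(n)                     # standardised environmental exposure
    GxE = G * E[:, None]                           # one interaction term per SNP

    def component(X, target_var):
        # Random effects, rescaled so the component has exactly the target variance.
        raw = X @ rng.standard_normal(X.shape[1])
        return raw / raw.std() * np.sqrt(target_var)

    noise = rng.standard_normal(n) * np.sqrt(1.0 - var_g - var_gxe)
    y = component(G, var_g) + component(GxE, var_gxe) + noise
    return G, E, GxE, y


def estimate_gxe_variance(G, E, GxE, y):
    """Gain in adjusted R^2 from adding the interaction terms to the main-effects model."""
    base = np.column_stack([G, E])
    full = np.column_stack([G, E, GxE])
    return adjusted_r2(full, y) - adjusted_r2(base, y)
```

Calling `estimate_gxe_variance(*simulate())` should return a value close to the simulated GxE variance of 0.05, and a value close to zero when `var_gxe=0.0`. A toy model of this kind ignores linkage disequilibrium, population structure, and the computational demands of genome-wide data, all of which a biobank-scale method has to address.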