Center for Computational Molecular Biology, Brown University, Providence, United States; Department of Ecology and Evolutionary Biology, Brown University, Providence, United States; Department of Integrative Biology, The University of Texas at Austin, Austin, United States; Department of Population Health, The University of Texas at Austin, Austin, United States
Center for Computational Molecular Biology, Brown University, Providence, United States; Institute for Computational and Experimental Research in Mathematics, Brown University, Providence, United States
Dana Udwin
Department of Biostatistics, Brown University, Providence, United States
Julian Stamp
Center for Computational Molecular Biology, Brown University, Providence, United States
Department of Integrative Biology, The University of Texas at Austin, Austin, United States; Department of Population Health, The University of Texas at Austin, Austin, United States
Sohini Ramachandran
Center for Computational Molecular Biology, Brown University, Providence, United States; Department of Ecology and Evolutionary Biology, Brown University, Providence, United States; Data Science Institute, Brown University, Providence, United States
Center for Computational Molecular Biology, Brown University, Providence, United States; Department of Biostatistics, Brown University, Providence, United States; Microsoft, Cambridge, United States
LD score regression (LDSC) is a method to estimate narrow-sense heritability from genome-wide association study (GWAS) summary statistics alone, making it a fast and popular approach. In this work, we present interaction-LD score (i-LDSC) regression: an extension of the original LDSC framework that accounts for interactions between genetic variants. By studying a wide range of generative models in simulations, and by re-analyzing 25 well-studied quantitative phenotypes from 349,468 individuals in the UK Biobank and up to 159,095 individuals in BioBank Japan, we show that the inclusion of a cis-interaction score (i.e., interactions between a focal variant and proximal variants) recovers genetic variance that is not captured by LDSC. For each of the 25 traits analyzed in the UK Biobank and BioBank Japan, i-LDSC detects additional variation contributed by genetic interactions. The i-LDSC software and its application to these biobanks represent a step towards resolving the contribution of non-additive genetic effects to complex trait variation.
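The core idea, regressing per-variant GWAS chi-square statistics on LD scores and then adding a second, cis-interaction score as an extra regressor, can be illustrated with a minimal sketch. This is not the i-LDSC software: it uses ordinary least squares on simulated toy data, whereas a real LD score regression analysis uses weighted regression with resampling-based standard errors. The function name `ld_score_regression`, the coefficient names `tau_true` and `theta_true`, and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np


def ld_score_regression(chi2, ld_scores, extra_scores=None):
    """Ordinary least-squares fit of chi2 on an intercept, LD scores,
    and (optionally) one additional per-variant score.

    Simplified illustration only; no regression weights are applied.
    """
    columns = [np.ones_like(ld_scores), ld_scores]
    if extra_scores is not None:
        columns.append(extra_scores)
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, chi2, rcond=None)
    return coef  # [intercept, LD-score slope, (extra-score slope)]


# Toy data (all values are assumptions, not taken from the paper).
rng = np.random.default_rng(0)
n_snps = 5000
ld = rng.gamma(shape=2.0, scale=10.0, size=n_snps)   # toy LD scores
cis = rng.gamma(shape=2.0, scale=5.0, size=n_snps)   # toy cis-interaction scores
tau_true, theta_true = 0.02, 0.01                    # hypothetical slopes
chi2 = 1.0 + tau_true * ld + theta_true * cis + rng.normal(scale=0.1, size=n_snps)

# LDSC-style fit (LD scores only) versus a fit that also includes
# the cis-interaction score as a second regressor.
additive_only = ld_score_regression(chi2, ld)
with_interactions = ld_score_regression(chi2, ld, cis)

print("LD-score-only coefficients:   ", additive_only)
print("With cis-interaction score:   ", with_interactions)
```

In this toy setup the second fit assigns the interaction-driven signal its own coefficient, while the LD-score-only fit has no term for it. How such coefficients translate into variance-component estimates depends on sample size, the number of variants, and the weighting scheme, none of which are modeled here.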