BMC Medicine (Jul 2019)

GWAS and enrichment analyses of non-alcoholic fatty liver disease identify new trait-associated genes and pathways across eMERGE Network

  • Bahram Namjou,
  • Todd Lingren,
  • Yongbo Huang,
  • Sreeja Parameswaran,
  • Beth L. Cobb,
  • Ian B. Stanaway,
  • John J. Connolly,
  • Frank D. Mentch,
  • Barbara Benoit,
  • Xinnan Niu,
  • Wei-Qi Wei,
  • Robert J. Carroll,
  • Jennifer A. Pacheco,
  • Isaac T. W. Harley,
  • Senad Divanovic,
  • David S. Carrell,
  • Eric B. Larson,
  • David J. Carey,
  • Shefali Verma,
  • Marylyn D. Ritchie,
  • Ali G. Gharavi,
  • Shawn Murphy,
  • Marc S. Williams,
  • David R. Crosslin,
  • Gail P. Jarvik,
  • Iftikhar J. Kullo,
  • Hakon Hakonarson,
  • Rongling Li,
  • The eMERGE Network,
  • Stavra A. Xanthakos,
  • John B. Harley

DOI
https://doi.org/10.1186/s12916-019-1364-z
Journal volume & issue
Vol. 17, no. 1
pp. 1 – 19

Abstract

Background
Non-alcoholic fatty liver disease (NAFLD) is a common chronic liver illness with a genetically heterogeneous background that can be accompanied by considerable morbidity and attendant health care costs. The pathogenesis and progression of NAFLD are complex, with many unanswered questions. We conducted genome-wide association studies (GWASs) using both adult and pediatric participants from the Electronic Medical Records and Genomics (eMERGE) Network to identify novel genetic contributors to this condition.

Methods
First, a natural language processing (NLP) algorithm was developed, tested, and deployed at each site to identify 1106 NAFLD cases and 8571 controls, along with histological data from liver tissue in 235 available participants. These include 1242 pediatric participants (396 cases, 846 controls). The algorithm included billing codes, text queries, laboratory values, and medication records. Next, GWASs were performed on NAFLD cases and controls, together with case-only analyses of histologic scores and liver function tests, adjusting for age, sex, site, ancestry, PC, and body mass index (BMI).

Results
Consistent with previous results, a robust association was detected for the PNPLA3 gene cluster in participants with European ancestry. At the PNPLA3-SAMM50 region, three SNPs, rs738409, rs738408, and rs3747207, showed the strongest association (best SNP rs738409, p = 1.70 × 10⁻²⁰). This effect was consistent in both pediatric (p = 9.92 × 10⁻⁶) and adult (p = 9.73 × 10⁻¹⁵) cohorts. This variant was also associated with disease severity and NAFLD Activity Score (NAS) (p = 3.94 × 10⁻⁸, beta = 0.85). PheWAS analysis links this locus to a spectrum of liver diseases beyond NAFLD, with a novel negative correlation with gout (p = 1.09 × 10⁻⁴). We also identified novel loci for NAFLD disease severity, including one novel locus for NAS near IL17RA (rs5748926, p = 3.80 × 10⁻⁸) and another near ZFP90-CDH1 for fibrosis (rs698718, p = 2.74 × 10⁻¹¹). Post-GWAS and gene-based analyses identified more than 300 genes that were used for functional and pathway enrichment analyses.

Conclusions
In summary, this study demonstrates clear confirmation of a previously described NAFLD risk locus and several novel associations. Further collaborative studies including an ethnically diverse population with well-characterized liver histologic features of NAFLD are needed to validate the novel findings.
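
As a reading aid, the cohort sizes quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not part of the published analysis. It assumes the 1242 pediatric participants are a subset of the 1106 cases and 8571 controls (as the phrase "These include" suggests), so the adult counts it derives are an inference, not figures reported in the abstract. The names `TOTAL`, `PEDIATRIC`, `ADULT`, `subtract`, and `case_fraction` are illustrative.

```python
# Counts quoted directly in the abstract.
TOTAL = {"cases": 1106, "controls": 8571}
PEDIATRIC = {"cases": 396, "controls": 846}


def subtract(total, subset):
    """Return per-key differences between two count dictionaries."""
    return {key: total[key] - subset[key] for key in total}


def case_fraction(group):
    """Fraction of participants in a group who are NAFLD cases."""
    return group["cases"] / (group["cases"] + group["controls"])


# Assumption (not stated in the abstract): pediatric participants are a
# subset of the totals, so adults are simply the remainder.
ADULT = subtract(TOTAL, PEDIATRIC)

for name, group in [("all", TOTAL), ("pediatric", PEDIATRIC), ("adult", ADULT)]:
    size = group["cases"] + group["controls"]
    print(f"{name}: n={size}, cases={group['cases']}, "
          f"case fraction={case_fraction(group):.3f}")
```

Under that assumption, the pediatric counts sum to the reported 1242, and the case fraction is noticeably higher in the pediatric subset than in the derived adult remainder, which is worth keeping in mind when comparing the pediatric and adult p-values.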

Keywords