Deep learning models predict regulatory variants in pancreatic islets and refine type 2 diabetes association signals

Agata Wesolowska-Andersen; Grace Zhuo Yu; Vibe Nylander; Fernando Abaitua; Matthias Thurner; Jason M Torres; Anubha Mahajan; Anna L Gloyn; Mark I McCarthy

doi:10.7554/eLife.51503

eLife (Jan 2020)

Affiliations

Agata Wesolowska-Andersen: Wellcome Centre for Human Genetics, Oxford, United Kingdom
Grace Zhuo Yu: Oxford Centre for Diabetes, Endocrinology and Metabolism, University of Oxford, Oxford, United Kingdom
Vibe Nylander: Oxford Centre for Diabetes, Endocrinology and Metabolism, University of Oxford, Oxford, United Kingdom
Fernando Abaitua: Wellcome Centre for Human Genetics, Oxford, United Kingdom
Matthias Thurner: Wellcome Centre for Human Genetics, Oxford, United Kingdom; Oxford Centre for Diabetes, Endocrinology and Metabolism, University of Oxford, Oxford, United Kingdom
Jason M Torres: Wellcome Centre for Human Genetics, Oxford, United Kingdom
Anubha Mahajan: Wellcome Centre for Human Genetics, Oxford, United Kingdom
Anna L Gloyn: Wellcome Centre for Human Genetics, Oxford, United Kingdom; Oxford Centre for Diabetes, Endocrinology and Metabolism, University of Oxford, Oxford, United Kingdom; Oxford NIHR Biomedical Centre, Churchill Hospital, Oxford, United Kingdom
Mark I McCarthy: Wellcome Centre for Human Genetics, Oxford, United Kingdom; Oxford Centre for Diabetes, Endocrinology and Metabolism, University of Oxford, Oxford, United Kingdom; Oxford NIHR Biomedical Centre, Churchill Hospital, Oxford, United Kingdom

DOI: https://doi.org/10.7554/eLife.51503
Journal volume & issue: Vol. 9

Abstract

Genome-wide association analyses have uncovered multiple genomic regions associated with type 2 diabetes (T2D), but identification of the causal variants at these loci remains a challenge. There is growing interest in the potential of deep learning models, which predict epigenome features from DNA sequence, to support inference concerning the regulatory effects of disease-associated variants. Here, we evaluate the advantages of training convolutional neural network (CNN) models on a broad set of epigenomic features collected in a single disease-relevant tissue (pancreatic islets in the case of T2D), as opposed to models trained on multiple human tissues. We report convergence of CNN-based metrics of regulatory function with conventional approaches to variant prioritization: genetic fine-mapping and regulatory annotation enrichment. We demonstrate that CNN-based analyses can refine association signals at T2D-associated loci and provide experimental validation for one such signal. We anticipate that these approaches will become routine in downstream analyses of GWAS.

Published in eLife

ISSN: 2050-084X (Online)
Publisher: eLife Sciences Publications Ltd
Country of publisher: United Kingdom
LCC subjects: Medicine; Science: Biology (General)
Website: https://elifesciences.org
