Multi-species, multi-transcription factor binding highlights conserved control of tissue-specific biological pathways
Benoit Ballester,
Alejandra Medina-Rivera,
Dominic Schmidt,
Mar Gonzàlez-Porta,
Matthew Carlucci,
Xiaoting Chen,
Kyle Chessman,
Andre J Faure,
Alister PW Funnell,
Angela Goncalves,
Claudia Kutter,
Margus Lukk,
Suraj Menon,
William M McLaren,
Klara Stefflova,
Stephen Watt,
Matthew T Weirauch,
Merlin Crossley,
John C Marioni,
Duncan T Odom,
Paul Flicek,
Michael D Wilson
Affiliations
Benoit Ballester
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom; Aix-Marseille Université, UMR1090 TAGC, Marseille, France; INSERM, UMR1090 TAGC, Marseille, France
Alejandra Medina-Rivera
Genetics and Genome Biology Program, SickKids Research Institute, Toronto, Canada
Dominic Schmidt
Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
Mar Gonzàlez-Porta
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Matthew Carlucci
Genetics and Genome Biology Program, SickKids Research Institute, Toronto, Canada
Xiaoting Chen
School of Electronic and Computing Systems, University of Cincinnati, Cincinnati, United States
Kyle Chessman
Genetics and Genome Biology Program, SickKids Research Institute, Toronto, Canada
Andre J Faure
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Alister PW Funnell
School of Biotechnology and Biomolecular Sciences, University of New South Wales, Kensington, Australia
Angela Goncalves
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Claudia Kutter
Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
Margus Lukk
Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
Suraj Menon
Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
William M McLaren
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Klara Stefflova
Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
Stephen Watt
Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
Matthew T Weirauch
Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, United States; Divisions of Biomedical Informatics and Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, United States
Merlin Crossley
School of Biotechnology and Biomolecular Sciences, University of New South Wales, Kensington, Australia
John C Marioni
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Duncan T Odom
Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Paul Flicek
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Michael D Wilson
Genetics and Genome Biology Program, SickKids Research Institute, Toronto, Canada; Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom; Department of Molecular Genetics, University of Toronto, Toronto, Canada
As exome sequencing gives way to genome sequencing, the need to interpret the function of regulatory DNA becomes increasingly important. To test whether evolutionary conservation of cis-regulatory modules (CRMs) gives insight into human gene regulation, we determined transcription factor (TF) binding locations of four liver-essential TFs in liver tissue from human, macaque, mouse, rat, and dog. Approximately two thirds of the TF-bound regions fell into CRMs. Fewer than half of the human CRMs were found as a CRM in the orthologous region of a second species. Shared CRMs were associated with liver pathways and disease loci identified by genome-wide association studies. Recurrent rare human disease-causing mutations at the promoters of several blood coagulation and lipid metabolism genes were also identified within CRMs shared in multiple species. This suggests that multi-species analyses of experimentally determined combinatorial TF binding will help identify genomic regions critical for tissue-specific gene control.