BMC Genomics (Dec 2004)

Arrays of ultraconserved non-coding regions span the loci of key developmental genes in vertebrate genomes

  • Klos Joanna M,
  • Engström Pär G,
  • Bruce Sara,
  • Bailey Peter,
  • Sandelin Albin,
  • Wasserman Wyeth W,
  • Ericson Johan,
  • Lenhard Boris

DOI
https://doi.org/10.1186/1471-2164-5-99
Journal volume & issue
Vol. 5, no. 1
p. 99

Abstract

Read online

Abstract Background Evolutionarily conserved sequences within or adjoining orthologous genes often serve as critical cis-regulatory regions. Recent studies have identified long, non-coding genomic regions that are perfectly conserved between human and mouse, termed ultra-conserved regions (UCRs). Here, we focus on UCRs that cluster around genes involved in early vertebrate development; genes conserved over 450 million years of vertebrate evolution. Results Based on a high resolution detection procedure, our UCR set enables novel insights into vertebrate genome organization and regulation of developmentally important genes. We find that the genomic positions of deeply conserved UCRs are strongly associated with the locations of genes encoding key regulators of development, with particularly strong positional correlation to transcription factor-encoding genes. Of particular importance is the observation that most UCRs are clustered into arrays that span hundreds of kilobases around their presumptive target genes. Such a hallmark signature is present around several uncharacterized human genes predicted to encode developmentally important DNA-binding proteins. Conclusion The genomic organization of UCRs, combined with previous findings, suggests that UCRs act as essential long-range modulators of gene expression. The exceptional sequence conservation and clustered structure suggests that UCR-mediated molecular events involve greater complexity than traditional DNA binding by transcription factors. The high-resolution UCR collection presented here provides a wealth of target sequences for future experimental studies to determine the nature of the biochemical mechanisms involved in the preservation of arrays of nearly identical non-coding sequences over the course of vertebrate evolution.