BMC Bioinformatics (Oct 2005)

Human promoter genomic composition demonstrates non-random groupings that reflect general cellular function

  • Montano Idalia,
  • Freebern Wendy J,
  • Collins Irene,
  • Cui Wenwu,
  • Tongbai Ron,
  • McNutt Markey C,
  • Haggerty Cynthia M,
  • Chandramouli GVR,
  • Gardner Kevin

DOI
https://doi.org/10.1186/1471-2105-6-259
Journal volume & issue
Vol. 6, no. 1
p. 259

Abstract

Background: The purpose of this study is to determine whether cis-regulatory elements within gene promoters group non-randomly in a way that can be detected independently of gene expression data, and whether this grouping correlates with the biological function of the gene.

Results: Using ProSpector, a web-based promoter search and annotation tool, we applied an unbiased approach to analyze the transcription factor binding site frequencies of 1400 base pair genomic segments, spanning 1200 base pairs upstream to 200 base pairs downstream of the transcriptional start site, for 7298 commonly studied human genes. Partitional clustering of the transcription factor binding site composition within these promoter segments reveals a small number of gene groups that are selectively enriched for gene ontology terms consistent with distinct aspects of cellular function. Significance ranking of the class-determining transcription factor binding sites within these clusters shows substantial overlap between the gene ontology terms of the transcription factors associated with the binding sites and the gene ontology terms of the regulated genes within each group.

Conclusion: Thus, gene sorting by promoter composition alone produces partitions in which the "regulated" and the "regulators" cosegregate into similar functional classes. These findings demonstrate that transcription factor binding site composition is non-randomly distributed between gene promoters in a manner that reflects and partially defines general gene class function.
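The workflow summarized in the abstract (take a 1400 bp window around each transcriptional start site, count binding sites in it, then partition genes by those counts) can be illustrated with a minimal Python sketch. Everything below other than the 1200/200 bp window sizes is an assumption made for illustration: the coordinate and strand conventions, the literal-string motif matching (real binding-site searches typically use position weight matrices), the toy sequences and motifs, and the small k-means routine standing in for whichever partitional clustering method the authors used. It is not ProSpector's implementation.

```python
from typing import List, Sequence, Tuple

UPSTREAM = 1200   # bp upstream of the TSS (from the abstract)
DOWNSTREAM = 200  # bp downstream of the TSS (from the abstract)


def promoter_window(tss: int, strand: str) -> Tuple[int, int]:
    """Return a half-open [start, end) interval of 1400 bp around a TSS.

    Assumes 0-based genomic coordinates (an illustrative convention).
    """
    if strand == "+":
        return tss - UPSTREAM, tss + DOWNSTREAM
    if strand == "-":
        # Mirror image: upstream lies at higher coordinates on the minus strand.
        return tss - DOWNSTREAM + 1, tss + UPSTREAM + 1
    raise ValueError("strand must be '+' or '-'")


def count_sites(seq: str, motif: str) -> int:
    """Count (possibly overlapping) literal occurrences of motif in seq."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def site_frequencies(seq: str, motifs: Sequence[str]) -> List[int]:
    """Build one gene's feature vector: one count per motif."""
    return [count_sites(seq, m) for m in motifs]


def kmeans(points: Sequence[Sequence[float]], k: int, iterations: int = 20) -> List[int]:
    """Tiny k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy data: short stand-ins for 1400 bp promoter sequences.
motifs = ["TATAAA", "GGGCGG"]
promoters = ["TATAAATATAAA", "GGGCGGGGGCGG", "TATAAACCCC", "CCGGGCGGCC"]
features = [site_frequencies(seq, motifs) for seq in promoters]
labels = kmeans(features, k=2)
```

Here `features` plays the role of the binding-site composition matrix (one row per gene, one column per binding-site type), and `labels` assigns each gene to a group using promoter composition alone, with no expression data involved.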