PLoS ONE (Jun 2007)
The information coded in the yeast response elements accounts for most of the topological properties of its transcriptional regulation network.
Abstract
The regulation of gene expression in a cell relies to a major extent on transcription factors, proteins that recognize and bind the DNA at specific binding sites (response elements) within the promoter regions associated with each gene. We present an information theoretic approach to modeling transcriptional regulatory networks, in terms of a simple "sequence-matching" rule and the statistics of the occurrence of binding sequences of given specificity in random promoter regions. The crucial biological inputs are the distribution of the amount of information coded in these cognate response elements and the length distribution of the promoter regions. We provide an analysis of the transcriptional regulatory network of the yeast Saccharomyces cerevisiae, which we extract from the available databases, with respect to the degree distributions, clustering coefficient, degree correlations, rich-club coefficient and k-core structure. We find that these topological features are in remarkable agreement with those predicted by our model, on the basis of the amount of information coded in the interaction between the transcription factors and response elements.
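
To make the "sequence-matching" rule concrete, the following is a minimal illustrative sketch and not the model specified in the paper. It assumes a four-letter DNA alphabet, uniformly random sequences, exact substring matching, and arbitrary ranges for binding-sequence and promoter lengths; all function names and parameter values are hypothetical. A transcription factor regulates a gene whenever its binding sequence occurs in that gene's promoter, so shorter (less informative) binding sequences and longer promoters both yield more edges.

```python
import random

ALPHABET = "ACGT"  # assumption: 4-letter DNA alphabet, uniform letter frequencies


def random_sequence(length, rng):
    """Draw a uniformly random sequence of the given length."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def match_probability(motif_len, promoter_len):
    """Approximate chance that a fixed motif occurs at least once in a
    random promoter. Treats the start positions as independent, which
    ignores correlations between overlapping windows."""
    if motif_len > promoter_len:
        return 0.0
    positions = promoter_len - motif_len + 1
    return 1.0 - (1.0 - 4.0 ** (-motif_len)) ** positions


def build_network(motifs, promoters):
    """Return directed edges (tf_index, gene_index) under exact matching."""
    edges = set()
    for i, motif in enumerate(motifs):
        for j, promoter in enumerate(promoters):
            if motif in promoter:  # the sequence-matching rule
                edges.add((i, j))
    return edges


def degrees(edges, n_tfs, n_genes):
    """Out-degree per transcription factor and in-degree per gene."""
    out_deg = [0] * n_tfs
    in_deg = [0] * n_genes
    for i, j in edges:
        out_deg[i] += 1
        in_deg[j] += 1
    return out_deg, in_deg


# Hypothetical toy parameters, chosen only for illustration.
rng = random.Random(0)
motifs = [random_sequence(rng.randint(3, 8), rng) for _ in range(20)]
promoters = [random_sequence(rng.randint(100, 600), rng) for _ in range(200)]

edges = build_network(motifs, promoters)
out_deg, in_deg = degrees(edges, len(motifs), len(promoters))
```

In this toy setting an exactly matched binding sequence of length l carries 2l bits, so the spread of binding-sequence lengths plays the role of the information distribution, and the spread of promoter lengths shapes the in-degrees. Quantities such as the clustering coefficient or k-core structure could then be computed from `edges` with any graph library.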