Journal of Integrative Bioinformatics (Jun 2012)
Computer and Statistical Analysis of Transcription Factor Binding and Chromatin Modifications by ChIP-seq data in Embryonic Stem Cell
Abstract
Advances in high-throughput sequencing technology have enabled the identification of transcription factor (TF) binding sites at genome scale. TF binding studies are important for medical applications and stem cell research. Somatic cells can be reprogrammed to a pluripotent state by the combined introduction of factors such as Oct4, Sox2, c-Myc, and Klf4. These reprogrammed cells share many characteristics with embryonic stem cells (ESCs) and are known as induced pluripotent stem cells (iPSCs). The signaling requirements for the maintenance of human and murine ESCs differ considerably. Genome-wide ChIP-seq TF binding maps in mouse stem cells include Oct4, Sox2, Nanog, Tbx3, and Smad2, as well as a group of other factors. ChIP-seq allows the study of new candidate transcription factors for reprogramming. It was shown that Nr5a2 could replace Oct4 in reprogramming. Epigenetic modifications play an important role in the regulation of gene expression, adding further complexity to the functioning of the transcription network. We studied associations among different histone modifications, using published data together with RNA Pol II sites. We found strong associations between activation marks and TF binding sites and present them qualitatively. To address the statistical analysis of genome-wide ChIP-seq maps, we developed a computer program that filters out noise signals and finds significant associations between binding site affinity and the number of sequence reads. The data provide new insights into the function of chromatin organization and regulation in stem cells.