BMC Genomics (Apr 2008)
Detailed characterization of the mouse embryonic stem cell transcriptome reveals novel genes and intergenic splicing associated with pluripotency
Abstract
Background Transcriptional control of embryonic stem (ES) cell pluripotency has been a subject of intense study. Transcriptional regulators including Oct4 (Oct3/4), Sox2 and Nanog are fundamental for maintaining the undifferentiated state. However, the ES cell transcriptome is not limited to their targets, and exhibits considerable complexity when assayed with microarray, MPSS, cDNA/EST sequencing, and SAGE technologies. To identify novel genes associated with pluripotency, we globally searched for ES transcripts not corresponding to known genes, validated their sequences, determined their expression profiles, and employed RNAi to test their function.
Results Gene Identification Signature (GIS) analysis, a SAGE derivative distinguished by paired 5' and 3' transcript end tags, identified 153 candidate novel transcriptional units (TUs) distinct from known genes in a mouse E14 ES mRNA library. We focused on 16 TUs free of artefacts and mapping discrepancies, five of which were validated by RT-PCR product sequencing. Two of the TUs were revealed by annotation to represent novel protein-coding genes: a PRY-domain cluster member and a KRAB-domain zinc finger. The other three TUs represented intergenic splicing events involving adjacent, functionally unrelated protein-coding genes transcribed in the same orientation, with one event potentially encoding a fusion protein containing domains from both component genes (Clk2 and Scamp3). Expression profiling using embryonic samples and adult tissue panels confirmed that three of the TUs were unique to or most highly expressed in ES cells. Expression levels of all five TUs dropped dramatically during three distinct chemically induced differentiation treatments of ES cells in culture. However, siRNA knockdowns of the TUs did not alter mRNA levels of pluripotency or differentiation markers, and did not affect cell morphology.
Conclusion Transcriptome libraries retain considerable potential for novel gene discovery despite massive recent cDNA and EST sequencing efforts; cDNA and EST evidence for these ES cell TUs had been limited or absent. RT-PCR and full-length sequencing remain essential in resolving the bottleneck between numerous candidate novel transcripts inferred from high-throughput sequencing and the small fraction that can be validated. RNAi results indicate that, despite their strong association with pluripotency, these five transcriptomic novelties may not be required for maintaining it.