PLoS ONE (Jan 2013)
DNA methylation patterns facilitate the identification of microRNA transcription start sites: a brain-specific study.
Abstract
Predicting the transcription start sites (TSSs) of microRNAs (miRNAs) is important for understanding how these small RNA molecules, known to regulate the translation and stability of mRNAs from protein-coding genes, are themselves regulated. Previous approaches rely primarily on genetic features, are trained on TSSs of protein-coding genes, and have low prediction accuracy. Recently, a technique based on support vector machines was proposed for miRNA TSS prediction; it trains the classifier on known miRNA TSSs and uses a set of existing and novel CpG-island-based features. Current progress in epigenetics research has provided genome-wide and tissue-specific reports on various phenotypic traits. We hypothesize that incorporating epigenetic characteristics into statistical models may lead to better prediction of the primary transcripts of human miRNAs. In this paper, we test this hypothesis on brain-specific miRNAs, using epigenetic as well as genetic features to predict their primary transcripts. To do so, we combine a feature selection technique with a classification model. Our prediction model achieves an accuracy of more than 80% and demonstrates the potential of epigenetic analysis for in silico prediction of TSSs.