IEEE Access (Jan 2018)

DeepSS: Exploring Splice Site Motif Through Convolutional Neural Network Directly From DNA Sequence

  • Xiuquan Du,
  • Yu Yao,
  • Yanyu Diao,
  • Huaixu Zhu,
  • Yanping Zhang,
  • Shuo Li

DOI
https://doi.org/10.1109/ACCESS.2018.2848847
Journal volume & issue
Vol. 6
pp. 32958 – 32978

Abstract

Splice site prediction and interpretation are crucial to understanding the complicated mechanisms underlying gene transcriptional regulation. Although existing computational approaches can classify true/false splice sites, their performance mostly relies on a set of sequence- or structure-based features, and model interpretability is relatively weak. In view of these challenges, we report a deep learning-based framework (DeepSS), which consists of a DeepSS-C module to classify splice sites and a DeepSS-M module to detect splice site sequence patterns. Unlike previous approaches, which treat feature construction and model training as separate steps, the DeepSS-C module accomplishes feature learning during model training itself. Compared with state-of-the-art algorithms, experimental results show that the DeepSS-C module yields more accurate performance on six publicly available donor/acceptor splice site data sets. In addition, the parameters of the trained DeepSS-M module are used for model interpretation and downstream analysis, including: 1) detection of genome factors (the truly relevant motifs that induce the related biological process) via filters, from a deep learning perspective; 2) analysis of the ability of CNN filters to detect motifs; and 3) co-analysis of filters and motifs on DNA sequence patterns. DeepSS is freely available at http://ailab.ahu.edu.cn:8087/DeepSS/index.html.
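
The idea of reading a motif off a CNN filter can be illustrated with a minimal sketch. It is not the DeepSS implementation: the function names, the hand-set filter weights, and the example sequence are all assumptions made for illustration. The code one-hot encodes a DNA string and slides a single filter along it, which is what one convolutional filter does before any nonlinearity or pooling. A trained filter would hold learned weights, whereas the filter here is hard-coded to respond to the dinucleotide GT.

```python
# Illustrative sketch only; names and weights are hypothetical, not taken from DeepSS.
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 values in the order A, C, G, T.

    Any other symbol (for example N) becomes an all-zero row.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def scan_filter(seq, weights):
    """Slide one filter (a list of width rows x 4 weights) along the sequence.

    Returns the raw activation at every valid start position.
    """
    encoded = one_hot(seq)
    width = len(weights)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = 0.0
        for offset in range(width):
            for channel in range(4):
                score += encoded[start + offset][channel] * weights[offset][channel]
        scores.append(score)
    return scores


def best_match(seq, weights):
    """Return (position, activation) of the strongest filter response."""
    scores = scan_filter(seq, weights)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical hand-set filter: position 0 rewards G, position 1 rewards T.
GT_FILTER = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

if __name__ == "__main__":
    print(best_match("AAGGTAAG", GT_FILTER))
```

In this toy setting the strongest activation marks where the sequence best matches the filter's preferred pattern, which is the intuition behind treating filter weights as motif detectors.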

Keywords