BMC Bioinformatics (Nov 2022)

iEnhancer-DCLA: using the original sequence to identify enhancers and their strength based on a deep learning framework

  • Meng Liao,
  • Jian-ping Zhao,
  • Jing Tian,
  • Chun-Hou Zheng

DOI
https://doi.org/10.1186/s12859-022-05033-x
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 16

Abstract

Read online

Abstract Enhancers are small regions of DNA that bind to proteins, which enhance the transcription of genes. The enhancer may be located upstream or downstream of the gene. It is not necessarily close to the gene to be acted on, because the entanglement structure of chromatin allows the positions far apart in the sequence to have the opportunity to contact each other. Therefore, identifying enhancers and their strength is a complex and challenging task. In this article, a new prediction method based on deep learning is proposed to identify enhancers and enhancer strength, called iEnhancer-DCLA. Firstly, we use word2vec to convert k-mers into number vectors to construct an input matrix. Secondly, we use convolutional neural network and bidirectional long short-term memory network to extract sequence features, and finally use the attention mechanism to extract relatively important features. In the task of predicting enhancers and their strengths, this method has improved to a certain extent in most evaluation indexes. In summary, we believe that this method provides new ideas in the analysis of enhancers.

Keywords