Frontiers in Genetics (Feb 2020)

An Information Entropy-Based Approach for Computationally Identifying Histone Lysine Butyrylation

  • Guohua Huang,
  • Yang Zheng,
  • Yao-Qun Wu,
  • Guo-Sheng Han,
  • Zu-Guo Yu

DOI
https://doi.org/10.3389/fgene.2019.01325
Journal volume & issue
Vol. 10

Abstract

Butyrylation plays a crucial role in cellular processes. Owing to the limitations of current techniques, identifying histone butyrylation sites on a large scale remains challenging. To fill this gap, we propose an approach based on information entropy and machine learning for computationally identifying histone butyrylation sites. The proposed method achieves an area under the receiver operating characteristic (ROC) curve of 0.92 on the training set under 3-fold cross-validation and 0.80 on the testing set in an independent test. Feature analysis suggests that amino acid residues upstream and downstream of butyrylation sites exhibit, to a certain extent, specific sequence motifs. Functional analysis suggests that histone butyrylation is most likely associated with four pathways (systemic lupus erythematosus, alcoholism, viral carcinogenesis, and transcriptional misregulation in cancer), is involved in binding to other molecules and in processes of biosynthesis, assembly, arrangement, or disassembly, and is located in complexes consisting of DNA, RNA, protein, and other components. The proposed method is useful for predicting histone butyrylation sites. The feature and functional analyses improve understanding of histone butyrylation and add to knowledge of the functions of butyrylated histones.
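
As a purely illustrative aside, the quantity behind "information entropy" can be sketched in a few lines. The abstract does not specify how the authors encode sequences, so the function name, the example windows, and the choice of residue composition as the distribution are all assumptions made here for illustration; this is not the paper's feature definition.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a peptide window.

    Hypothetical helper: the abstract does not state the actual feature
    encoding, so this only shows the generic entropy formula
    H = -sum(p_i * log2(p_i)) over observed residue frequencies.
    """
    if not window:
        return 0.0
    total = len(window)
    # Relative frequency of each distinct residue in the window.
    probs = [count / total for count in Counter(window).values()]
    return sum(-p * math.log2(p) for p in probs)


# Made-up windows centred on a lysine (K); not data from the paper.
low_diversity = shannon_entropy("KKKKKKK")   # a single residue type
high_diversity = shannon_entropy("ARTKQSG")  # seven distinct residues
```

A window made of a single residue type has zero entropy, while a window of seven distinct residues reaches the maximum for its length, log2(7) bits. Values like these could in principle serve as inputs to a classifier, though whether the authors use them this way is not stated in the abstract.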

Keywords