BMC Bioinformatics (Apr 2018)

Higher-order partial least squares for predicting gene expression levels from chromatin states

  • Shiquan Sun,
  • Xifang Sun,
  • Yan Zheng

DOI
https://doi.org/10.1186/s12859-018-2100-y
Journal volume & issue
Vol. 19, no. S5
pp. 47 – 54

Abstract


Background: Extensive studies have shown that gene expression levels are strongly affected by combinations of chromatin marks through at least two mechanisms, activation and repression, but the combinatorial patterns of these marks remain unclear. To better understand the relationship between histone modifications and gene expression levels, we introduce a purely geometric higher-order representation, the tensor (also called a multidimensional array), which may capture more of the unknown interactions among chromatin states when predicting gene expression levels.

Results: The prediction models were learned from regions spanning 10 kb upstream and 10 kb downstream of the transcriptional start sites (TSSs) in three species (human, rhesus macaque, and chimpanzee) with five histone modifications (H3K4me1, H3K4me3, H3K27ac, H3K27me3, and Pol II). Experimental results demonstrate that the proposed method predicts gene expression levels more accurately than several other popular methods. Specifically, it improves on both commonly used criteria, R and RMSE, by as much as 1.7% and 11%, respectively.

Conclusions: The overall aim of this work is to show that the higher-order representation can incorporate more of the unknown interaction information between histone modifications across different species.
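
To make the tensor representation and the two evaluation criteria concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes that R denotes the Pearson correlation coefficient, that RMSE is the usual root-mean-square error, and that the signal around each TSS is summarized in position bins; the helper names (`build_tensor`, `pearson_r`, `rmse`), the bin count, and the toy values are all hypothetical.

```python
import math

# The five marks named in the abstract.
MARKS = ["H3K4me1", "H3K4me3", "H3K27ac", "H3K27me3", "Pol II"]


def build_tensor(signals, n_genes, n_bins, marks=MARKS):
    """Arrange signals as a genes x bins x marks nested list (a 3rd-order tensor).

    `signals` maps (gene_index, bin_index, mark_name) to a value;
    missing entries default to 0.0. The binning scheme is an assumption.
    """
    return [
        [[signals.get((g, b, m), 0.0) for m in marks] for b in range(n_bins)]
        for g in range(n_genes)
    ]


def pearson_r(y_true, y_pred):
    """Pearson correlation between observed and predicted expression levels."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted expression levels."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Toy data: 2 genes, 4 position bins, 5 marks.
tensor = build_tensor({(0, 1, "H3K27ac"): 2.0}, n_genes=2, n_bins=4)
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
r_value = pearson_r(observed, predicted)
rmse_value = rmse(observed, predicted)
```

A higher R and a lower RMSE both indicate better predictions, which is why the abstract reports gains on the two criteria separately.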

Keywords