BMC Bioinformatics (May 2022)

Gene co-expression network based on part mutual information for gene-to-gene relationship and gene-cancer correlation analysis

  • Yi-Hua Jiang,
  • Jie Long,
  • Zhi-Bin Zhao,
  • Liang Li,
  • Zhe-Xiong Lian,
  • Zhi Liang,
  • Jia-Rui Wu

DOI
https://doi.org/10.1186/s12859-022-04732-9
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 22

Abstract

Background: Finding correlation patterns is an important goal of analyzing biological data. Currently available methods for correlation analysis mainly use indirect association measures, such as the Pearson correlation coefficient, and focus on the interpretation of networks at the level of modules. For biological objects such as genes, collective function depends on pairwise gene-to-gene interactions. However, the large number of redundant results from module-level methods often necessitates further detailed analysis of gene interactions. New approaches for measuring direct associations among variables, such as part mutual information (PMI), may help us better interpret the correlation pattern of biological data at the level of variable pairs.

Results: We use PMI to calculate gene co-expression networks from cancer mRNA transcriptome data. Our results show that PMI-based networks can represent the correlation pattern with fewer edges and are robust across biological conditions. The PMI-based networks recall significantly more of the important omics-defined gene-pair relationships than networks based on the Pearson correlation coefficient (PCC). Based on scores derived from PMI-recalled copy number variation or DNA methylation gene pairs, patients with cancer can be divided into groups with significant differences in disease-specific survival.

Conclusions: PMI, which measures direct associations between variables, extracts more important biological relationships at the level of gene pairs than conventional indirect association measures do. It can be used to refine module-level results from other correlation methods. In particular, PMI is beneficial for analyzing biological data from complex systems, such as cancer transcriptome data.

Keywords