Frontiers in Molecular Biosciences (Mar 2021)

Mining the Selective Remodeling of DNA Methylation in Promoter Regions to Identify Robust Gene-Level Associations With Phenotype

  • Yuan Quan,
  • Yuan Quan,
  • Fengji Liang,
  • Si-Min Deng,
  • Yuexing Zhu,
  • Yuexing Zhu,
  • Ying Chen,
  • Jianghui Xiong,
  • Jianghui Xiong,
  • Jianghui Xiong,
  • Jianghui Xiong

DOI
https://doi.org/10.3389/fmolb.2021.597513
Journal volume & issue
Vol. 8

Abstract

Read online

Epigenetics is an essential biological frontier linking genetics to the environment, where DNA methylation is one of the most studied epigenetic events. In recent years, through the epigenome-wide association study (EWAS), researchers have identified thousands of phenotype-related methylation sites. However, the overlaps of identified phenotype-related DNA methylation sites between various studies are often quite small, and it might be due to the fact that methylation remodeling has a certain degree of randomness within the genome. Thus, the identification of robust gene-phenotype associations is crucial to interpreting pathogenesis. How to integrate the methylation values of different sites on the same gene and to mine the DNA methylation at the gene level remains a challenge. A recent study found that the DNA methylation difference of the gene body and promoter region has a strong correlation with gene expression. In this study, we proposed a Statistical difference of DNA Methylation between Promoter and Other Body Region (SIMPO) algorithm to extract DNA methylation values at the gene level. First, by choosing to smoke as an environmental exposure factor, our method led to significant improvements in gene overlaps (from 5 to 17%) between different datasets. In addition, the biological significance of phenotype-related genes identified by SIMPO algorithm is comparable to that of the traditional probe-based methods. Then, we selected two disease contents (e.g., insulin resistance and Parkinson’s disease) to show that the biological efficiency of disease-related gene identification increased from 15.43 to 44.44% (p-value = 1.20e–28). In summary, our results declare that mining the selective remodeling of DNA methylation in promoter regions can identify robust gene-level associations with phenotype, and the characteristic remodeling of a given gene’s promoter region can reflect the essence of disease.

Keywords