BMC Bioinformatics (Jan 2022)

A novel nonparametric computational strategy for identifying differential methylation regions

  • Xifang Sun,
  • Donglin Wang,
  • Jiaqiang Zhu,
  • Shiquan Sun

DOI
https://doi.org/10.1186/s12859-022-04563-8
Journal volume & issue
Vol. 23, no. S1
pp. 1 – 10

Abstract

Background: DNA methylation has long been known as an epigenetic gene silencing mechanism. As a motivating example, the methylomes of cancer and non-cancer cells show a number of methylation differences, indicating that certain characteristics of cancer cells may be related to methylation characteristics. Robust methods for detecting differentially methylated regions (DMRs) could help scientists narrow down genome regions and even find biologically important regions. Although several statistical methods have been developed for detecting DMRs, none has emerged as the default or the strongest. Fisher’s exact test is direct but not suitable for data with multiple replicates, while regression-based methods usually come with a large number of assumptions. More complicated methods have been proposed, but they are often difficult to interpret.

Results: In this paper, we propose a three-step nonparametric kernel smoothing method that is both flexible and straightforward to implement and interpret. The proposed method relies on local quadratic fitting to find the set of equilibrium points (points at which the first derivative is 0) and the corresponding set of confidence windows. Potential regions are further refined using biological criteria and finally selected based on a Bonferroni-adjusted t-test cutoff. Using a comparison of three senescent and three proliferating cell lines to illustrate our method, we identified a total of 1077 DMRs on chromosome 21.

Conclusions: We have proposed a completely nonparametric, statistically straightforward, and interpretable method for detecting differentially methylated regions. Compared with existing methods, its non-reliance on model assumptions and its straightforward nature make it a competitive alternative for defining DMRs.
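To make the first step of the method concrete, the following is a minimal sketch of local quadratic fitting used to locate equilibrium points, that is, positions where the estimated first derivative changes sign. It is not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the evaluation grid, the toy signal, and the function names `local_quadratic` and `equilibrium_points` are all assumptions made for illustration, and the confidence windows, biological refinement, and Bonferroni-adjusted t-test are omitted.

```python
import numpy as np

def local_quadratic(x, y, x0, bandwidth):
    """Kernel-weighted quadratic fit centred at x0.

    Returns the fitted value and the fitted first derivative at x0.
    The Gaussian kernel is an assumption; the abstract does not name one.
    """
    d = x - x0
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    # Design matrix for y ~ b0 + b1*(x - x0) + b2*(x - x0)^2
    X = np.column_stack([np.ones_like(d), d, d ** 2])
    sw = np.sqrt(w)  # weighted least squares via row scaling
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[0], beta[1]

def equilibrium_points(x, y, grid, bandwidth):
    """Positions where the estimated derivative crosses zero on the grid."""
    slopes = np.array([local_quadratic(x, y, g, bandwidth)[1] for g in grid])
    # Indices where consecutive slope estimates have opposite signs
    idx = np.where(np.sign(slopes[:-1]) * np.sign(slopes[1:]) < 0)[0]
    g0, g1 = grid[idx], grid[idx + 1]
    s0, s1 = slopes[idx], slopes[idx + 1]
    # Linear interpolation of the zero crossing between the two grid points
    return g0 - s0 * (g1 - g0) / (s1 - s0)

# Toy signal (hypothetical): a smooth curve with a single minimum at x = 3
x = np.linspace(0.0, 6.0, 200)
y = (x - 3.0) ** 2
grid = np.linspace(0.0, 6.0, 60)
pts = equilibrium_points(x, y, grid, bandwidth=1.0)
```

In a DMR setting, `x` would plausibly be genomic position and `y` a between-group methylation difference, with each equilibrium point marking a candidate region centre. On the toy signal, `pts` contains a single value close to 3.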