Computational and Structural Biotechnology Journal (Jan 2022)
A novel workflow for the qualitative analysis of DNA methylation data
Abstract
DNA methylation is an epigenetic modification that plays a pivotal role in major biological mechanisms, such as gene regulation, genomic imprinting, and genome stability. Different combinations of methylated cytosines at a given DNA locus generate different epialleles, and alterations of epiallele composition have been associated with several pathological conditions. Existing computational methods and statistical tests for DNA methylation analysis are mostly based on comparing average methylation levels at CpG sites, and they often neglect non-CpG methylation. Here, we present EpiStatProfiler, an R package that enables the analysis of CpG and non-CpG based epialleles from bisulfite sequencing data through a collection of dedicated extraction functions and statistical tests. EpiStatProfiler also provides a set of auxiliary features, such as customizable genomic ranges, strand-specific epiallele analysis, locus annotation, and gene set enrichment analysis. We showcase the package's functionalities on two public datasets by identifying putatively relevant loci in mice harboring the Huntington’s disease-causing Htt gene mutation and in Ctcf +/− mice, each compared to their wild-type counterparts. To our knowledge, EpiStatProfiler is the first package providing functionalities dedicated to the analysis of epiallele composition derived from any kind of bisulfite sequencing experiment.
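The difference between average methylation and epiallele composition can be illustrated with a minimal sketch. It is written in Python for illustration only and does not use or reproduce the EpiStatProfiler R interface. The function names (`epiallele_counts`, `average_methylation`) and the two toy samples are hypothetical, invented for this example. Each read is assumed to cover the same four cytosines of one locus, with 1 for methylated and 0 for unmethylated.

```python
from collections import Counter

def epiallele_counts(reads):
    """Count each distinct methylation pattern (epiallele) among the reads."""
    return Counter("".join(str(call) for call in read) for read in reads)

def average_methylation(reads):
    """Fraction of methylated calls over all cytosines in all reads."""
    methylated = sum(sum(read) for read in reads)
    total = sum(len(read) for read in reads)
    return methylated / total

# Toy data (hypothetical): four reads per sample over the same four cytosines.
sample_a = [(1, 1, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (0, 0, 1, 1)]
sample_b = [(1, 0, 1, 0), (0, 1, 0, 1), (1, 1, 0, 0), (0, 0, 1, 1)]

for name, reads in (("A", sample_a), ("B", sample_b)):
    # Both samples share the same average but differ in epiallele composition.
    print(name, average_methylation(reads), dict(epiallele_counts(reads)))
```

In this toy case both samples have the same average methylation, yet sample A contains two distinct epialleles and sample B contains four. A comparison based only on averages would not separate them, whereas a comparison of epiallele composition would.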