PLoS Computational Biology (Oct 2021)
NanoMethViz: An R/Bioconductor package for visualizing long-read methylation data
Abstract
A key benefit of long-read nanopore sequencing technology is the ability to detect modified DNA bases, such as 5-methylcytosine. The lack of R/Bioconductor tools for the effective visualization of nanopore methylation profiles between samples from different experimental groups led us to develop the NanoMethViz R package. Our software can handle methylation output generated from a range of different methylation callers and manages large datasets using a compressed data format. To fully explore the methylation patterns in a dataset, NanoMethViz allows plotting of data at various resolutions. At the sample level, we use dimensionality reduction to look at the relationships between methylation profiles in an unsupervised way. At the level of feature classes, such as genes or CpG islands, we scale each feature to relative positions and aggregate the resulting methylation profiles. At the finest resolution, we visualize methylation patterns across individual reads along the genome using spaghetti plots and heatmaps, allowing users to explore particular genes or genomic regions of interest. In summary, our software makes the handling of methylation signal more convenient, expands upon the visualization options for nanopore data, and works seamlessly with existing methylation analysis tools available in the Bioconductor project. Our software is available at https://bioconductor.org/packages/NanoMethViz.
Author summary
Recently developed nanopore sequencing technology enables DNA methylation measurement on long DNA molecules. This technology provides a new tool for investigating DNA methylation, a form of DNA modification that plays an essential role in early development and is linked to some forms of cancer in adulthood. There is a lack of R/Bioconductor software for effective visualization of methylation calls from nanopore platforms, which hinders the analysis and presentation of results.
We developed NanoMethViz, the first R package to create visualizations for nanopore methylation data at various summary resolutions. NanoMethViz produces publication-quality plots to inspect the broad differences in methylation profiles of different samples, the aggregated methylation profiles of classes of genomic features, and the methylation profiles of individual long reads. Our software provides an efficient data format for storing methylation information and converts data from popular methylation calling software to formats recognized by statistical methods available in the Bioconductor toolkit for further analysis. NanoMethViz allows researchers to more quickly and effectively analyze their data and produce high-quality figures to present their results.
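The aggregation of feature-class profiles mentioned above (scaling features such as genes or CpG islands to relative positions and combining their methylation profiles) can be illustrated with a minimal, language-neutral sketch. This is not the NanoMethViz implementation or its API: the function names, the fixed number of bins, the strand flip, and the toy data are all assumptions made for illustration only.

```python
def relative_position(pos, start, end, strand="+"):
    """Map a genomic coordinate to [0, 1] within a feature (hypothetical helper)."""
    rel = (pos - start) / (end - start)
    # Assumption: minus-strand features are flipped so 0 is always the 5' end.
    return 1.0 - rel if strand == "-" else rel


def aggregate_profile(sites, features, n_bins=4):
    """Average methylation per relative-position bin across all features.

    sites:    list of (chrom, pos, methylation), methylation in [0, 1]
    features: list of (chrom, start, end, strand)
    Returns a list of length n_bins; bins with no sites are None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for chrom, start, end, strand in features:
        for site_chrom, pos, meth in sites:
            if site_chrom != chrom or not (start <= pos <= end):
                continue
            rel = relative_position(pos, start, end, strand)
            # Clamp so that rel == 1.0 falls into the last bin.
            b = min(int(rel * n_bins), n_bins - 1)
            sums[b] += meth
            counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Toy data: two features of different lengths on opposite strands.
features = [("chr1", 100, 200, "+"), ("chr1", 1000, 1400, "-")]
sites = [
    ("chr1", 110, 0.0), ("chr1", 190, 1.0),    # inside the first feature
    ("chr1", 1040, 1.0), ("chr1", 1360, 0.0),  # inside the second feature
]
print(aggregate_profile(sites, features))  # [0.0, None, None, 1.0]
```

Because positions are rescaled before binning, features of different lengths and orientations contribute to the same relative bins, which is what makes a class-level profile comparable across features.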