F1000Research (Sep 2024)
Metagenome quality metrics and taxonomical annotation visualization through the integration of MAGFlow and BIgMAG [version 2; peer review: 1 approved, 2 approved with reservations]
Abstract
Background: Building Metagenome-Assembled Genomes (MAGs) from highly complex metagenomic datasets involves a series of steps, from cleaning the sequences and assembling them to finally grouping them into bins. Along the way, multiple tools are used to assess the quality and integrity of each MAG. Nonetheless, even when these tools are incorporated within end-to-end pipelines, their outputs must be visualized and analyzed manually, because they are not integrated into a single framework.
Methods: We developed a Nextflow pipeline (MAGFlow) that estimates the quality of MAGs through a variety of approaches (BUSCO, CheckM2, GUNC and QUAST) and taxonomically annotates the metagenomes using GTDB-Tk2. MAGFlow is coupled to a Python-Dash application (BIgMAG) that displays the concatenated outputs of the tools run by MAGFlow, highlighting the most important metrics in a single interactive environment along with a comparison/clustering of the input data.
Results: With MAGFlow/BIgMAG, users can benchmark the MAGs obtained through different workflows, or establish the quality of the MAGs belonging to different samples, following the divide-and-rule methodology.
Conclusions: MAGFlow/BIgMAG is a unique tool that integrates state-of-the-art tools to study different quality metrics and to extract visually as much information as possible from a wide range of genome features.
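To illustrate what "concatenated outputs" could look like in practice, the following minimal sketch merges per-tool metric tables into one row per MAG. It is not MAGFlow's actual code: the function name `merge_tool_reports`, the metric names, the bin identifiers and all values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def merge_tool_reports(reports):
    """Combine per-tool metric tables into one row per MAG.

    `reports` maps a tool name to {mag_id: {metric: value}}.
    Each column is prefixed with the tool name to avoid collisions.
    """
    merged = defaultdict(dict)
    for tool, table in reports.items():
        for mag_id, metrics in table.items():
            for metric, value in metrics.items():
                merged[mag_id][f"{tool}.{metric}"] = value
    return dict(merged)


# Hypothetical per-tool results (illustrative values, not real output).
reports = {
    "checkm2": {
        "bin_1": {"completeness": 95.2, "contamination": 1.1},
        "bin_2": {"completeness": 61.0, "contamination": 7.5},
    },
    "busco": {
        "bin_1": {"complete_pct": 93.0},
        "bin_2": {"complete_pct": 58.4},
    },
    "gunc": {
        "bin_1": {"pass": True},
        "bin_2": {"pass": False},
    },
}

table = merge_tool_reports(reports)
for mag_id, row in sorted(table.items()):
    print(mag_id, row)
```

A single table keyed by MAG identifier is one plausible shape for the data that an interactive dashboard would then plot or cluster; the real pipeline may organize its outputs differently.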