A rapid and simple method for assessing and representing genome sequence relatedness

Briand, M; Bouzid, M; Hunault, G; Legeay, M; Fischer-Le Saux, M; Barret, M

doi:10.24072/pcjournal.37

Peer Community Journal (Nov 2021)

Affiliations

Briand, M: Univ Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, F-49000 Angers, France
Bouzid, M: Univ Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, F-49000 Angers, France
Hunault, G: Université d’Angers, Laboratoire d’Hémodynamique, Interaction Fibrose et Invasivité tumorale hépatique, UPRES 3859, IFR 132, F-49045 Angers, France
Legeay, M: Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark
Fischer-Le Saux, M: Univ Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, F-49000 Angers, France
Barret, M: Univ Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, F-49000 Angers, France

DOI: https://doi.org/10.24072/pcjournal.37
Journal volume & issue: Vol. 1

Abstract

Coherent genomic groups are frequently used as a proxy for bacterial species delineation through computation of overall genome relatedness indices (OGRI). Average nucleotide identity (ANI) is a widely employed method for estimating relatedness between genomic sequences. However, pairwise comparisons of genome sequences based on ANI are relatively computationally intensive and therefore preclude analyses of large datasets composed of thousands of genome sequences. In this work we propose a workflow to compute and visualize relationships between genomic sequences. A dataset containing more than 3,500 Pseudomonas genome sequences was successfully classified with an alternative OGRI based on k-mer counts in a few hours with the same precision as ANI. A new visualization method based on zoomable circle packing was employed for assessing relationships among the 350 groups generated. Amendment of databases with these Pseudomonas groups greatly improved the classification of metagenomic read sets with a k-mer-based classifier. The developed workflow was integrated into the user-friendly KI-S tool, which is available at the following address: https://iris.angers.inra.fr/galaxypub-cfbp
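
To make the idea of a k-mer-based relatedness index more concrete, the following is a minimal illustrative sketch in Python. It is not the KI-S implementation: the function names (kmer_set, kmer_similarity), the use of the Jaccard index as the similarity measure, and the toy sequences are all assumptions chosen for illustration, and the abstract does not specify the exact index or k-mer length used by the workflow. The sketch also ignores reverse complements, which a real genome comparison would normally take into account.

```python
def kmer_set(sequence, k):
    """Return the set of distinct k-mers of length k found in a DNA sequence.

    Hypothetical helper for illustration. Windows containing characters
    other than A, C, G, T (for example N) are skipped.
    """
    sequence = sequence.upper()
    valid = set("ACGT")
    kmers = set()
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if set(window) <= valid:
            kmers.add(window)
    return kmers


def kmer_similarity(seq_a, seq_b, k):
    """Jaccard index of the two k-mer sets: |A and B| / |A or B|.

    Assumption: Jaccard is used here only as a simple stand-in for a
    k-mer-based OGRI; the published workflow may use a different measure.
    """
    a = kmer_set(seq_a, k)
    b = kmer_set(seq_b, k)
    union = a | b
    if not union:
        return 0.0  # no valid k-mers in either sequence
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Toy sequences, far shorter than real genomes.
    genome_1 = "ACGTACGT"
    genome_2 = "ACGTTTTT"
    print(kmer_similarity(genome_1, genome_2, k=4))
```

Because a comparison of this kind reduces to set operations on k-mers rather than sequence alignment, it scales more easily to thousands of genomes than alignment-based ANI, which is the motivation stated in the abstract.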

Published in Peer Community Journal

ISSN: 2804-3871 (Online)
Publisher: Peer Community In
Country of publisher: France
LCC subjects: Auxiliary sciences of history: Archaeology; Science
Website: https://peercommunityjournal.org/
