Automated evaluation of multiple sequence alignment methods to handle third generation sequencing errors

Coralie Rohmer; Hélène Touzet; Antoine Limasset

doi:10.7717/peerj.17731

PeerJ (Sep 2024)

Affiliations

Coralie Rohmer: Université de Lille, Lille, France
Hélène Touzet: Université de Lille, Lille, France
Antoine Limasset: Université de Lille, Lille, France

DOI: https://doi.org/10.7717/peerj.17731
Journal volume & issue: Vol. 12, p. e17731

Abstract

Most third-generation sequencing (TGS) processing tools rely on multiple sequence alignment (MSA) methods to manage sequencing errors. Despite the broad range of MSA approaches available, only a limited selection of implementations is commonly used in practice for this type of application, and no comprehensive comparative assessment of existing tools has been undertaken to date. In this context, we have developed an automatic pipeline, named MSA Limit, designed to facilitate the execution and evaluation of diverse MSA methods across a spectrum of conditions representative of TGS reads. MSA Limit offers insights into alignment accuracy, time efficiency, and memory utilization. It serves as a valuable resource for both users and developers, aiding in the assessment of algorithmic performance and assisting users in selecting the most appropriate tool for their specific experimental settings. Through a series of experiments using real and simulated data, we demonstrate the value of such exploration. Our findings reveal that in certain scenarios, popular methods may not consistently exhibit optimal efficiency and that the choice of the most effective method varies depending on factors such as sequencing depth, genome characteristics, and read error patterns. MSA Limit is an open-source and freely available tool. All code and data pertaining to it and this manuscript are available at https://gitlab.cristal.univ-lille.fr/crohmer/msa-limit.

Published in PeerJ

ISSN: 2167-8359 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Medicine; Science: Biology (General)
Website: https://peerj.com/
