Assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type

Yen-Yi Liu; Bo-Han Chen; Chih-Chieh Chen; Chien-Shun Chiou

doi:10.7717/peerj.11842

PeerJ (Aug 2021)

Affiliations

Yen-Yi Liu: Department of Public Health, China Medical University, Taichung, Taiwan
Bo-Han Chen: Center for Research, Diagnostics and Vaccine Development, Centers for Disease Control, Ministry of Health and Welfare, Taichung, Taiwan
Chih-Chieh Chen: Institute of Medical Science and Technology, National Sun Yat-sen University, Kaohsiung, Taiwan
Chien-Shun Chiou: Center for Research, Diagnostics and Vaccine Development, Centers for Disease Control, Ministry of Health and Welfare, Taichung, Taiwan

DOI: https://doi.org/10.7717/peerj.11842
Journal volume & issue: Vol. 9, p. e11842

Abstract

With the falling cost of next-generation sequencing, whole-genome sequencing (WGS)-based methods such as core-genome multilocus sequence typing (cgMLST) have become widely used. However, gene-based methods require raw reads to be assembled into contigs, a step that can introduce errors into the assemblies. Because the robustness of cgMLST depends on assembly quality, the results of WGS should be assessed at every stage, from sequencing to assembly. In this study, we investigated how read length, read depth, and the choice of assembler affect the recovery of genes from reference genomes. Reads with different combinations of read length and read depth were simulated from the complete genomes of three common food-borne pathogens: Escherichia coli, Listeria monocytogenes, and Salmonella enterica. We found that assembly quality was affected mainly by read depth, irrespective of the assembler used. In addition, we suggest several cutoff values for future cgMLST experiments and recommend combinations of read length, read depth, and assembler that offer a better cost-performance ratio for cgMLST.
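To make the relationship between read length and read depth concrete, the sketch below uses the standard mean-coverage relation (depth = number of reads × read length / genome size) to estimate how many reads a simulation would need for a given combination. It is only an illustration: the function name `reads_needed`, the 5 Mb genome size, and the grid of read lengths and depths are assumptions chosen for the example, not values or tools taken from the study.

```python
import math


def reads_needed(genome_size_bp, read_length_bp, target_depth, paired=True):
    """Estimate reads (or read pairs, if paired=True) for a target mean depth.

    Uses the mean-coverage relation: depth = total sequenced bases / genome size.
    """
    bases_needed = genome_size_bp * target_depth
    # A read pair contributes two reads' worth of bases.
    bases_per_fragment = read_length_bp * (2 if paired else 1)
    return math.ceil(bases_needed / bases_per_fragment)


if __name__ == "__main__":
    genome_size = 5_000_000  # assumed round figure for a bacterial genome
    # Hypothetical grid of settings, for illustration only.
    for read_length in (100, 150, 250):
        for depth in (10, 30, 50):
            pairs = reads_needed(genome_size, read_length, depth)
            print(f"{read_length} bp, {depth}x: {pairs} read pairs")
```

At a fixed depth, longer reads mean fewer reads are required, which is one reason the two parameters have to be considered together when weighing cost against assembly quality.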

Published in PeerJ

ISSN: 2167-8359 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Medicine; Science: Biology (General)
Website: https://peerj.com/
