A benchmarking study of SARS-CoV-2 whole-genome sequencing protocols using COVID-19 patient samples
Tiantian Liu,
Zhong Chen,
Wanqiu Chen,
Xin Chen,
Maryam Hosseini,
Zhaowei Yang,
Jing Li,
Diana Ho,
David Turay,
Ciprian P. Gheorghe,
Wendell Jones,
Charles Wang
Affiliations
Tiantian Liu
Center for Genomics, School of Medicine, Loma Linda University, Loma Linda, CA, USA
Zhong Chen
Center for Genomics, School of Medicine, Loma Linda University, Loma Linda, CA, USA
Wanqiu Chen
Center for Genomics, School of Medicine, Loma Linda University, Loma Linda, CA, USA
Xin Chen
Center for Genomics, School of Medicine, Loma Linda University, Loma Linda, CA, USA; Division of Microbiology & Molecular Genetics, Department of Basic Science, School of Medicine, Loma Linda University, Loma Linda, CA, USA
Maryam Hosseini
Center for Genomics, School of Medicine, Loma Linda University, Loma Linda, CA, USA
Zhaowei Yang
Center for Genomics, School of Medicine, Loma Linda University, Loma Linda, CA, USA; Department of Allergy and Clinical Immunology, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, the First Affiliated Hospital of Guangzhou Medical University, Guangzhou, Guangdong Province, People's Republic of China
Jing Li
Division of Microbiology & Molecular Genetics, Department of Basic Science, School of Medicine, Loma Linda University, Loma Linda, CA, USA; Department of Allergy and Clinical Immunology, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, the First Affiliated Hospital of Guangzhou Medical University, Guangzhou, Guangdong Province, People's Republic of China
Diana Ho
Center for Genomics, School of Medicine, Loma Linda University, Loma Linda, CA, USA
David Turay
Department of Surgery, School of Medicine, Loma Linda University, Loma Linda, CA, USA
Ciprian P. Gheorghe
Department of Gynecology & Obstetrics, School of Medicine, Loma Linda University, Loma Linda, CA, USA
Wendell Jones
EA Genomics, Division of Q2 Solutions, Morrisville, NC, USA; Corresponding author
Charles Wang
Center for Genomics, School of Medicine, Loma Linda University, Loma Linda, CA, USA; Division of Microbiology & Molecular Genetics, Department of Basic Science, School of Medicine, Loma Linda University, Loma Linda, CA, USA; Corresponding author
Summary: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a novel coronavirus responsible for the COVID-19 pandemic and the resulting unprecedented global health emergency. Whole-genome sequencing (WGS) of SARS-CoV-2 plays a critical role in understanding the disease. Performance varies across SARS-CoV-2 WGS technologies, but no benchmarking study has yet compared the different WGS protocols. We compared seven SARS-CoV-2 WGS library protocols using RNA from patient nasopharyngeal swab samples under two storage conditions, at low and high viral inputs. We found large differences in mappability and genome coverage across protocols, as well as variation in the sensitivity, reproducibility, and precision of single-nucleotide variant calling. For certain amplicon-based protocols, an appropriate primer trimming step is critical for accurate single-nucleotide variant calling. We ranked the protocols' performance on six metrics. Our findings offer guidance for choosing appropriate WGS protocols to characterize SARS-CoV-2 and its evolution.