Whole genome sequencing of group A Streptococcus: development and evaluation of an automated pipeline for emm gene typing

Georgia Kapatai; Juliana Coelho; Steven Platt; Victoria J. Chalker

doi:10.7717/peerj.3226

PeerJ (Apr 2017)

Affiliations

Georgia Kapatai: Respiratory and Vaccine Preventable Bacterial Reference Unit, Public Health England, London, United Kingdom
Juliana Coelho: Respiratory and Vaccine Preventable Bacterial Reference Unit, Public Health England, London, United Kingdom
Steven Platt: Infectious Disease Informatics, Public Health England, London, United Kingdom
Victoria J. Chalker: Respiratory and Vaccine Preventable Bacterial Reference Unit, Public Health England, London, United Kingdom

DOI: https://doi.org/10.7717/peerj.3226
Journal volume & issue: Vol. 5, p. e3226

Abstract

Streptococcus pyogenes (group A Streptococcus, GAS) is the most common cause of bacterial throat infections and can cause mild to severe skin and soft tissue infections, including impetigo, erysipelas and necrotizing fasciitis, as well as systemic and fatal infections including septicaemia and meningitis. The estimated incidence of invasive group A streptococcal infection (iGAS) in industrialised countries is approximately three per 100,000 per year. Typing is currently used in England and Wales to monitor bacterial strains of S. pyogenes causing invasive infections and those isolated from patients and healthcare/care workers in cluster and outbreak situations. Sequence analysis of the emm gene is the currently accepted gold standard methodology for GAS typing. A comprehensive database of emm types observed from superficial and invasive GAS strains from England and Wales informs outbreak control teams during investigations. Each year the Bacterial Reference Department, Public Health England (PHE) receives approximately 3,000 GAS isolates from England and Wales. In April 2014 the Bacterial Reference Department, PHE began genomic sequencing of referred S. pyogenes isolates and those pertaining to selected elderly/nursing care or maternity clusters from 2010 to inform future reference services and outbreak analysis (n = 3,047). In line with the modernizing strategy of PHE, we developed a novel bioinformatics pipeline that can predict emm types using whole genome sequence (WGS) data. The efficiency of this method was measured by comparing the emm type assigned by this method against the result from the current gold standard methodology; concordance to emm subtype level was observed in 93.8% (2,852/3,040) of our cases, whereas in 2.4% (n = 72) of our cases concordance was observed to emm type level only. The remaining 3.8% (n = 117) of our cases corresponded to novel types/subtypes, contamination, laboratory sample transcription errors or problems arising from high sequence similarity of the allele sequence or low mapping coverage. De novo assembly analysis was performed in the two latter groups (n = 72 + 117) and was able to diagnose the problem and, where possible, resolve the discordance (60/72 and 20/117, respectively). Overall, we have demonstrated that our WGS emm-typing pipeline is a reliable and robust system that can be implemented to determine emm type for the routine service.
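The concordance comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that labels follow the common `emm<type>.<subtype>` convention (for example `emm89.14`), and the function names `split_emm`, `concordance_level` and `summarise` are hypothetical names introduced here for illustration only.

```python
from collections import Counter


def split_emm(label):
    """Split a label such as 'emm89.14' into (type, subtype).

    Assumes the 'emm<type>.<subtype>' naming convention; a label with
    no '.' yields an empty subtype.
    """
    emm_type, _, subtype = label.strip().lower().partition(".")
    return emm_type, subtype


def concordance_level(predicted, reference):
    """Classify agreement between a WGS-predicted and a reference label."""
    pred_type, pred_sub = split_emm(predicted)
    ref_type, ref_sub = split_emm(reference)
    if pred_type != ref_type:
        return "discordant"   # emm types differ
    if pred_sub == ref_sub:
        return "subtype"      # full agreement, including the subtype
    return "type"             # same emm type, different subtype


def summarise(pairs):
    """Return {level: (count, percentage)} for (predicted, reference) pairs."""
    counts = Counter(concordance_level(p, r) for p, r in pairs)
    total = sum(counts.values())
    return {level: (n, round(100 * n / total, 1)) for level, n in counts.items()}


# Toy data, invented for illustration; not taken from the study.
toy_pairs = [
    ("emm1.0", "emm1.0"),
    ("emm1.0", "emm1.25"),
    ("emm1.0", "emm12.0"),
    ("emm89.0", "emm89.0"),
]
summary = summarise(toy_pairs)
```

Applied to a full set of predicted and reference labels, the same tally would give percentages of the kind reported above, such as 2,852 of 3,040 cases concordant to subtype level (93.8%).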

Published in PeerJ

ISSN: 2167-8359 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Medicine; Science: Biology (General)
Website: https://peerj.com/
