PLoS ONE (Jan 2019)
Overcoming the pitfalls of automatic interpretation of whole genome sequencing data by online tools for the prediction of pyrazinamide resistance in Mycobacterium tuberculosis.
Abstract
Objectives
Automated online software tools that analyse whole genome sequencing (WGS) data without the need for bioinformatics expertise can encourage the implementation of WGS-based molecular drug susceptibility testing (DST) in routine diagnostic settings for tuberculosis (TB). Pyrazinamide (PZA) is a key drug in current and future TB treatment regimens; however, the available tools have been reported to have low predictive power for PZA resistance, which may make users hesitant to use them. This study aimed to elucidate the reasons for this low predictive power and to uncover the real performance of the tools when their variation calling lists are taken into account (manual inspection), rather than only the automated reporting systems (default setting) evaluated in previous studies.
Methods
WGS data from 191 datasets, comprising 108 PZA-resistant and 83 PZA-susceptible strains, were used to evaluate the potential performance of the available online tools (TB Profiler, TGS-TB, PhyResSE, and CASTB) for predicting phenotypic PZA resistance.
Results
When the variation calling lists were taken into consideration, TGS-TB and PhyResSE detected 73 variants in pncA in total (47 non-synonymous mutations and 26 indels), covering all mutations in the 108 PZA-resistant strains. All 73 variants were confirmed by Sanger sequencing. TB Profiler also detected all of the variants except three complete losses, two large deletions at the 3'-end, and one relatively large insertion of pncA. In contrast, many of the 73 variants were missing from the automated reports of every tool except TGS-TB; CASTB detected only 20 of these variants. Applying the 'non-wild type sequence' approach to predict PZA resistance significantly improved accuracy compared with the automated results obtained by each tool.
Conclusion
Users can obtain more accurate predictions of PZA resistance than previously reported by manually checking the results and applying the 'non-wild type sequence' approach.
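As an illustration only, the 'non-wild type sequence' approach might be sketched as below. The abstract does not spell out the exact rule, so this sketch assumes that any non-synonymous mutation, indel, or gene loss in pncA is read as predicted PZA resistance, and that synonymous changes are ignored. The `Variant` record, the effect labels, and both function names are hypothetical; they do not correspond to the output format of TB Profiler, TGS-TB, PhyResSE, or CASTB, and the toy strains are invented, not taken from the 191 datasets.

```python
from dataclasses import dataclass

# Assumption: effect labels a user would transcribe by hand from a tool's
# variation calling list; these names are hypothetical, not any tool's output.
NON_WILD_TYPE_EFFECTS = {"non_synonymous", "indel", "gene_loss"}


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str  # free-text description of the change
    effect: str  # "synonymous", "non_synonymous", "indel", or "gene_loss"


def predict_pza_resistant(variants):
    """Return True if any pncA variant alters the wild-type sequence."""
    return any(
        v.gene == "pncA" and v.effect in NON_WILD_TYPE_EFFECTS
        for v in variants
    )


def evaluate(predictions, phenotypes):
    """Compare predicted with phenotypic resistance (two lists of bool)."""
    pairs = list(zip(predictions, phenotypes))
    tp = sum(p and y for p, y in pairs)          # resistant, called resistant
    tn = sum(not p and not y for p, y in pairs)  # susceptible, called susceptible
    fp = sum(p and not y for p, y in pairs)      # susceptible, called resistant
    fn = sum(not p and y for p, y in pairs)      # resistant, called susceptible
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(pairs) if pairs else float("nan"),
    }


# Invented toy input: strain name -> pncA calls copied from a variant list.
toy_calls = {
    "strain_A": [Variant("pncA", "example missense change", "non_synonymous")],
    "strain_B": [],
    "strain_C": [Variant("pncA", "example silent change", "synonymous")],
    "strain_D": [Variant("pncA", "example frameshift", "indel")],
}
toy_phenotype = {"strain_A": True, "strain_B": False,
                 "strain_C": False, "strain_D": True}

names = sorted(toy_calls)
metrics = evaluate(
    [predict_pza_resistant(toy_calls[n]) for n in names],
    [toy_phenotype[n] for n in names],
)
```

The rule is deliberately coarse: it asks whether pncA differs from wild type at all, not whether a specific variant appears in a curated resistance catalogue, which is why it depends on the full variant list rather than on a tool's automated report.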