PLoS Computational Biology (Jun 2023)
Ten simple rules for the sharing of bacterial genotype–phenotype data on antimicrobial resistance
Abstract
The increasing availability of high-throughput sequencing (frequently termed next-generation sequencing (NGS)) data has created opportunities to gain deeper insights into the mechanisms of a number of diseases and is already impacting many areas of medicine and public health. The area of infectious diseases stands somewhat apart from other human diseases insofar as the relevant genomic data comes from the microbes rather than their human hosts. A particular concern about the threat of antimicrobial resistance (AMR) has driven the collection and reporting of large-scale datasets containing information from microbial genomes together with antimicrobial susceptibility test (AST) results. Unfortunately, the lack of clear standards or guiding principles for the reporting of such data is hampering the field’s advancement. We therefore present our recommendations for the publication and sharing of genotype and phenotype data on AMR, in the form of 10 simple rules. The adoption of these recommendations will enhance AMR data interoperability and help enable its large-scale analysis using computational biology tools, including mathematical modelling and machine learning. We hope that these rules can shed light on often overlooked but nonetheless necessary aspects of AMR data sharing and enhance the field’s ability to address the problems of understanding AMR mechanisms, tracking their emergence and spread in populations, and predicting microbial susceptibility to antimicrobials for diagnostic purposes.
Author summary
The growing worldwide threat of antimicrobial resistance (AMR) makes the sharing of resistance data a priority for both researchers and public health practitioners. In particular, the growth of high-throughput sequencing data, in conjunction with AMR phenotypes, has the potential to revolutionise AMR diagnostics and surveillance. However, there is significant heterogeneity in the ways that this type of data is currently shared, which makes it challenging to analyse at scale. As both producers and users of publicly available genotype–phenotype data on AMR, we propose 10 simple rules that can mitigate this situation and nudge the field towards the adoption of better practices.