BMC Biology (Sep 2020)

LSTrAP-Crowd: prediction of novel components of bacterial ribosomes with crowd-sourced analysis of RNA sequencing data

  • Benedict Hew,
  • Qiao Wen Tan,
  • William Goh,
  • Jonathan Wei Xiong Ng,
  • Marek Mutwil

DOI
https://doi.org/10.1186/s12915-020-00846-9
Journal volume & issue
Vol. 18, no. 1
pp. 1 – 13

Abstract

Read online

Abstract Background Bacterial resistance to antibiotics is a growing health problem that is projected to cause more deaths than cancer by 2050. Consequently, novel antibiotics are urgently needed. Since more than half of the available antibiotics target the structurally conserved bacterial ribosomes, factors involved in protein synthesis are thus prime targets for the development of novel antibiotics. However, experimental identification of these potential antibiotic target proteins can be labor-intensive and challenging, as these proteins are likely to be poorly characterized and specific to few bacteria. Here, we use a bioinformatics approach to identify novel components of protein synthesis. Results In order to identify these novel proteins, we established a Large-Scale Transcriptomic Analysis Pipeline in Crowd (LSTrAP-Crowd), where 285 individuals processed 26 terabytes of RNA-sequencing data of the 17 most notorious bacterial pathogens. In total, the crowd processed 26,269 RNA-seq experiments and used the data to construct gene co-expression networks, which were used to identify more than a hundred uncharacterized genes that were transcriptionally associated with protein synthesis. We provide the identity of these genes together with the processed gene expression data. Conclusions We identified genes related to protein synthesis in common bacterial pathogens and thus provide a resource of potential antibiotic development targets for experimental validation. The data can be used to explore additional vulnerabilities of bacteria, while our approach demonstrates how the processing of gene expression data can be easily crowd-sourced.

Keywords