Nature Communications (Aug 2024)

Rp3: Ribosome profiling-assisted proteogenomics improves coverage and confidence during microprotein discovery

  • Eduardo Vieira de Souza,
  • Angie L. Bookout,
  • Christopher A. Barnes,
  • Brendan Miller,
  • Pablo Machado,
  • Luiz A. Basso,
  • Cristiano V. Bizarro,
  • Alan Saghatelian

DOI
https://doi.org/10.1038/s41467-024-50301-4
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 14

Abstract

There has been a dramatic increase in the identification of non-canonical translation and a significant expansion of the protein-coding genome. Among the strategies used to identify unannotated small Open Reading Frames (smORFs) that encode microproteins, Ribosome profiling (Ribo-Seq) is the gold standard for the annotation of novel coding sequences by reporting on smORF translation. In Ribo-Seq, ribosome-protected footprints (RPFs) that map to multiple genomic sites are removed since they cannot be unambiguously assigned to a specific genomic location. Furthermore, RPFs necessarily result in short (25-34 nucleotides) reads, increasing the chance of multi-mapping alignments, such that smORFs residing in these regions cannot be identified by Ribo-Seq. Moreover, it has been challenging to identify protein evidence for Ribo-Seq. To solve this, we developed Rp3, a pipeline that integrates proteogenomics and Ribosome profiling to provide unambiguous evidence for a subset of microproteins missed by current Ribo-Seq pipelines. Here, we show that Rp3 maximizes proteomics detection and confidence of microprotein-encoding smORFs.