Deep transcriptome annotation enables the discovery and functional characterization of cryptic small proteins
Sondos Samandi,
Annie V Roy,
Vivian Delcourt,
Jean-François Lucier,
Jules Gagnon,
Maxime C Beaudoin,
Benoît Vanderperre,
Marc-André Breton,
Julie Motard,
Jean-François Jacques,
Mylène Brunelle,
Isabelle Gagnon-Arsenault,
Isabelle Fournier,
Aida Ouangraoua,
Darel J Hunting,
Alan A Cohen,
Christian R Landry,
Michelle S Scott,
Xavier Roucou
Affiliations
Sondos Samandi
Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Canada; PROTEO, Québec Network for Research on Protein Function, Structure and Engineering, Québec, Canada
Annie V Roy
Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Canada; PROTEO, Québec Network for Research on Protein Function, Structure and Engineering, Québec, Canada
Vivian Delcourt
Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Canada; PROTEO, Québec Network for Research on Protein Function, Structure and Engineering, Québec, Canada; INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM) F-59000 Lille, Université de Lille, Lille, France
Jean-François Lucier
Department of Biology, Université de Sherbrooke, Québec, Canada; Center for Scientific Computing, Information Technologies Services, Université de Sherbrooke, Québec, Canada
Jules Gagnon
Department of Biology, Université de Sherbrooke, Québec, Canada; Center for Scientific Computing, Information Technologies Services, Université de Sherbrooke, Québec, Canada
Maxime C Beaudoin
Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Canada; PROTEO, Québec Network for Research on Protein Function, Structure and Engineering, Québec, Canada
Benoît Vanderperre
Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Canada
Marc-André Breton
Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Canada
Julie Motard
Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Canada; PROTEO, Québec Network for Research on Protein Function, Structure and Engineering, Québec, Canada
Jean-François Jacques
Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Canada; PROTEO, Québec Network for Research on Protein Function, Structure and Engineering, Québec, Canada
Mylène Brunelle
Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Canada; PROTEO, Québec Network for Research on Protein Function, Structure and Engineering, Québec, Canada
Isabelle Gagnon-Arsenault
PROTEO, Québec Network for Research on Protein Function, Structure and Engineering, Québec, Canada; Département de biochimie, microbiologie et bioinformatique, Université Laval, Québec, Canada; IBIS, Université Laval, Québec, Canada
Isabelle Fournier
INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM) F-59000 Lille, Université de Lille, Lille, France
Aida Ouangraoua
Department of Computer Science, Université de Sherbrooke, Québec, Canada
Darel J Hunting
Department of Nuclear Medicine and Radiobiology, Université de Sherbrooke, Québec, Canada
Alan A Cohen
Department of Family Medicine, Université de Sherbrooke, Québec, Canada
Christian R Landry
PROTEO, Québec Network for Research on Protein Function, Structure and Engineering, Québec, Canada; Département de biochimie, microbiologie et bioinformatique, Université Laval, Québec, Canada; IBIS, Université Laval, Québec, Canada
Michelle S Scott
Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Canada
Xavier Roucou
Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Canada; PROTEO, Québec Network for Research on Protein Function, Structure and Engineering, Québec, Canada
Recent functional, proteomic and ribosome profiling studies in eukaryotes have concurrently demonstrated the translation of alternative open reading frames (altORFs) in addition to annotated protein-coding sequences (CDSs). We show that a large number of small proteins could in fact be coded by these altORFs. The putative alternative proteins translated from altORFs have orthologs in many species and contain functional domains. Evolutionary analyses indicate that altORFs often show more extreme conservation patterns than their CDSs. Reanalysis of proteomic datasets using a database containing predicted alternative proteins detects thousands of alternative proteins. This is illustrated with specific examples, including altMiD51, a 70-amino-acid mitochondrial fission-promoting protein encoded in MiD51/Mief1/SMCR7L, a gene encoding an annotated protein promoting mitochondrial fission. Our results suggest that many genes are multicoding genes and code for a large protein and one or several small proteins.