Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States; Department of Biomedical Sciences, School of Public Health, University at Albany, New York, United States
Todd A Gray
Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States; Department of Biomedical Sciences, School of Public Health, University at Albany, New York, United States
Most bacterial open reading frames (ORFs) are identified by automated prediction algorithms. However, these algorithms often fail to identify ORFs lacking canonical features such as a length of >50 codons or the presence of an upstream Shine-Dalgarno sequence. Here, we use ribosome profiling approaches to identify actively translated ORFs in Mycobacterium tuberculosis. Most of the ORFs we identify have not been previously described, indicating that the M. tuberculosis transcriptome is pervasively translated. The newly described ORFs are predominantly short, with many encoding proteins of ≤50 amino acids. Codon usage of the newly discovered ORFs suggests that most have not been subject to purifying selection, and hence are unlikely to contribute to cell fitness. Nevertheless, we identify 90 new ORFs (median length of 52 codons) that bear the hallmarks of purifying selection. Thus, our data suggest that pervasive translation of short ORFs in M. tuberculosis serves as a rich source for the evolution of new functional proteins.