PLoS ONE (Jan 2016)

The Functional Human C-Terminome.

  • Surbhi Sharma,
  • Oniel Toledo,
  • Michael Hedden,
  • Kenneth F Lyon,
  • Steven B Brooks,
  • Roxanne P David,
  • Justin Limtong,
  • Jacklyn M Newsome,
  • Nemanja Novakovic,
  • Sanguthevar Rajasekaran,
  • Vishal Thapar,
  • Sean R Williams,
  • Martin R Schiller

DOI
https://doi.org/10.1371/journal.pone.0152731
Journal volume & issue
Vol. 11, no. 4
p. e0152731

Abstract

All translated proteins end with a carboxylic acid commonly called the C-terminus. Many short functional sequences (minimotifs) are located on or immediately proximal to the C-terminus. However, information about the function of protein C-termini has not been consolidated into a single source. Here, we built a new "C-terminome" database and web system focused on human proteins. Approximately 3,600 C-termini in the human proteome have a minimotif with an established molecular function. To help evaluate the function of the remaining C-termini in the human proteome, we inferred minimotifs from those identified by experimentation in rodent cells, predicted minimotifs based upon consensus sequence matches, and predicted novel highly repetitive sequences in C-termini. Predictions can be ranked by enrichment scores or by Genomic Evolutionary Rate Profiling (GERP) scores, a measure of evolutionary constraint. By searching the last 10 amino acids of proteins in the human proteome for new anchored sequences 3 to 10 residues long, with up to 5 degenerate positions in the consensus sequence, we identified new consensus sequences that predict instances in the majority of human genes. All of this information is consolidated into a database that can be accessed through a C-terminome web system with search and browse functions for minimotifs and human proteins. A predicted function based on a known consensus sequence is assigned to nearly half the proteins in the human proteome. Weblink: http://cterminome.bio-toolkit.com.
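
The consensus-based prediction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "anchored" means the match must end exactly at the C-terminus, that a lowercase "x" in the consensus marks a degenerate position, and that only the last 10 residues are scanned. The function names, the consensus string "[ST]x[VIL]", and the toy sequences are all hypothetical and chosen only for illustration.

```python
import re

def consensus_to_regex(consensus):
    """Turn a simple consensus string into a regex anchored at the C-terminus.

    Assumed notation (not taken from the paper): lowercase 'x' is a
    degenerate position matching any residue; bracketed sets such as
    [ST] are passed through unchanged.
    """
    body = consensus.replace("x", ".")
    # '$' forces the match to end at the last residue of the sequence.
    return re.compile(body + "$")

def find_cterminal_matches(proteins, consensus, window=10):
    """Return {protein name: matched C-terminal residues} for each hit.

    Only the last `window` residues of each sequence are scanned.
    """
    pattern = consensus_to_regex(consensus)
    hits = {}
    for name, seq in proteins.items():
        tail = seq[-window:]
        match = pattern.search(tail)
        if match:
            hits[name] = match.group(0)
    return hits

# Hypothetical toy sequences, not real human proteins.
proteins = {
    "toy_protein_A": "MKTAYIAKQRQISFVKSHFSRQLEETSV",
    "toy_protein_B": "MGSSHHHHHHSSGLVPRGSHMASMTGGQQ",
    "toy_protein_C": "MADEEKLPPGWEKRMSRSSGRVYYFNHITNASQWERPSGNSSDV",
}

hits = find_cterminal_matches(proteins, "[ST]x[VIL]")
print(hits)
```

With these toy inputs, the two sequences whose last three residues fit the pattern are reported and the third is skipped. A fuller version would also need to score each hit, for example by enrichment or evolutionary constraint as the abstract describes, which this sketch does not attempt.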