Genome Biology (Nov 2018)

CHESS: a new human gene catalog curated from thousands of large-scale RNA sequencing experiments reveals extensive transcriptional noise

  • Mihaela Pertea
  • Alaina Shumate
  • Geo Pertea
  • Ales Varabyou
  • Florian P. Breitwieser
  • Yu-Chi Chang
  • Anil K. Madugundu
  • Akhilesh Pandey
  • Steven L. Salzberg

DOI
https://doi.org/10.1186/s13059-018-1590-2
Journal volume & issue
Vol. 19, no. 1
pp. 1 – 14

Abstract

We assembled the sequences from deep RNA sequencing experiments by the Genotype-Tissue Expression (GTEx) project, to create a new catalog of human genes and transcripts, called CHESS. The new database contains 42,611 genes, of which 20,352 are potentially protein-coding and 22,259 are noncoding, and a total of 323,258 transcripts. These include 224 novel protein-coding genes and 116,156 novel transcripts. We detected over 30 million additional transcripts at more than 650,000 genomic loci, nearly all of which are likely nonfunctional, revealing a heretofore unappreciated amount of transcriptional noise in human cells. The CHESS database is available at http://ccb.jhu.edu/chess.

Keywords