Cells (Sep 2024)

Importance of Transcript Variants in Transcriptome Analyses

  • Kevin Vo,
  • Yashica Sharma,
  • Anohita Paul,
  • Ryan Mohamadi,
  • Amelia Mohamadi,
  • Patrick E. Fields,
  • M. A. Karim Rumi

DOI
https://doi.org/10.3390/cells13171502
Journal volume & issue
Vol. 13, no. 17
p. 1502

Abstract

RNA sequencing (RNA-Seq) has become a widely adopted technique for studying gene expression. However, conventional RNA-Seq analyses rely on gene expression (GE) values that aggregate all the transcripts produced under a single gene identifier, overlooking the complexity of transcript variants arising from different transcription start sites or alternative splicing. Transcript variants may encode proteins with diverse functional domains, or noncoding RNAs. This study explored the implications of neglecting transcript variants in RNA-Seq analyses. Among the 1334 transcription factor (TF) genes expressed in mouse embryonic stem (ES) or trophoblast stem (TS) cells, 652 were differentially expressed in TS cells based on GE values (365 upregulated and 287 downregulated; absolute fold change ≥ 2, false discovery rate (FDR) p-value ≤ 0.05). The 365 upregulated genes expressed 883 transcript variants. Further transcript expression (TE) based analyses identified only 174 ([…] 0.05) between ES and TS cells expressed 2215 transcript variants. These included 477 (>21%) differentially expressed transcripts (276 upregulated and 201 downregulated; absolute fold change ≥ 2, FDR p-value ≤ 0.05). Hence, GE-based RNA-Seq analyses do not represent expression levels accurately, because transcripts from the same gene can be expressed divergently. Our findings show that including transcript variants in RNA-Seq analyses yields a more precise understanding of a gene's functional and regulatory landscape; ignoring the variants may result in an erroneous interpretation.
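
The differential-expression criterion quoted in the abstract (an absolute change of at least 2-fold with FDR p-value ≤ 0.05) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `classify`, the convention that fold change is a TS/ES ratio greater than zero, and the example values are all hypothetical and chosen only for illustration.

```python
def classify(fold_change, fdr, min_fold=2.0, max_fdr=0.05):
    """Label one gene or transcript by the abstract's thresholds.

    Assumption: fold_change is a TS/ES expression ratio (> 0), so a
    2-fold decrease is 0.5. Tools that report signed fold changes
    (for example -2.0) would need converting first.
    """
    if fdr > max_fdr:
        return "not significant"
    if fold_change >= min_fold:
        return "upregulated"
    if fold_change <= 1.0 / min_fold:
        return "downregulated"
    return "not significant"


# Hypothetical gene: the aggregate GE value looks upregulated, while its
# individual transcript variants diverge.
gene_call = classify(fold_change=2.4, fdr=0.01)
transcript_calls = [
    classify(fold_change=5.0, fdr=0.001),  # variant 1
    classify(fold_change=0.4, fdr=0.02),   # variant 2
    classify(fold_change=1.1, fdr=0.60),   # variant 3
]

# Share of the 2215 transcripts from non-differential genes that were
# themselves differentially expressed (counts taken from the abstract).
share = 100 * 477 / 2215
```

In this made-up case the gene-level call is "upregulated" even though only one of its three variants is, which is the kind of discrepancy the abstract describes.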

Keywords