Informatics in Medicine Unlocked (Jan 2022)

Computational methods for analyzing RNA-sequencing contaminated samples and its impact on cancer genome studies

  • Zahra Mortezaei

Journal volume & issue
Vol. 32
p. 101054

Abstract

A tumor is a mass of cells that grow abnormally. Sequencing technology can help identify the genetic mutations that cause cancer. Next-generation sequencing (NGS) can supply genetic data and determine the number of mutant genes in different cancers by sequencing the whole genome, exome, and/or transcriptome. The transcriptome represents the ribonucleic acid (RNA) content of a specific organism and carries information about different diseases, functional genome elements, and the molecular components of tissues and cells. Whole transcriptome shotgun sequencing, known as RNA-Seq, is a technology that uses NGS to capture a snapshot of the RNA present at a given time, drawn from millions of individual RNA molecules. RNA-Seq is subject to several biases, which can be classified by source: nucleotide composition, guanine-cytosine (GC) content, insert size, cell type, and contamination. In molecular biology, contamination can bias the results of genetic analyses and make viral infections difficult to observe. Here we review RNA-Seq methodologies, the different types of contamination that may occur during RNA-Seq preparation, and some methods that can be applied to estimate transcript abundances in contaminated samples.

Keywords