RNA-seq data science: From raw data to effective interpretation

Dhrithi Deshpande; Karishma Chhugani; Yutong Chang; Aaron Karlsberg; Caitlin Loeffler; Jinyang Zhang; Agata Muszyńska; Viorel Munteanu; Harry Yang; Jeremy Rotman; Laura Tao; Brunilda Balliu; Elizabeth Tseng; Eleazar Eskin; Fangqing Zhao; Pejman Mohammadi; Paweł P. Łabaj; Serghei Mangul

doi:10.3389/fgene.2023.997383

Frontiers in Genetics (Mar 2023)

Affiliations

Dhrithi Deshpande: Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
Karishma Chhugani: Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
Yutong Chang: Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
Aaron Karlsberg: Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
Caitlin Loeffler: Department of Computer Science, University of California, Los Angeles, CA, United States
Jinyang Zhang: Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
Agata Muszyńska: Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
Agata Muszyńska: Institute of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
Viorel Munteanu: Department of Computers, Informatics and Microelectronics, Technical University of Moldova, Chisinau, Moldova
Harry Yang: Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, CA, United States
Jeremy Rotman: Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
Laura Tao: Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
Brunilda Balliu: Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
Elizabeth Tseng: Pacific Biosciences, Menlo Park, CA, United States
Eleazar Eskin: Department of Computer Science, University of California, Los Angeles, CA, United States
Eleazar Eskin: Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
Eleazar Eskin: Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, United States
Fangqing Zhao: Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
Fangqing Zhao: Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
Pejman Mohammadi: Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States
Paweł P. Łabaj: Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
Paweł P. Łabaj: Department of Biotechnology, Boku University Vienna, Vienna, Austria
Serghei Mangul: Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
Serghei Mangul: Department of Quantitative and Computational Biology, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA, United States

Journal volume & issue: Vol. 14

Abstract

RNA sequencing (RNA-seq) has become an exemplary technology in modern biology and clinical science. Its immense popularity is due in large part to the continuous efforts of the bioinformatics community to develop accurate and scalable computational tools to analyze the enormous amounts of transcriptomic data that it produces. RNA-seq analysis enables genes and their corresponding transcripts to be probed for a variety of purposes, such as detecting novel exons or whole transcripts, assessing expression of genes and alternative transcripts, and studying alternative splicing structure. It can be a challenge, however, to obtain meaningful biological signals from raw RNA-seq data because of the enormous scale of the data as well as the inherent limitations of different sequencing technologies, such as amplification bias or biases of library preparation. The need to overcome these technical challenges has pushed the rapid development of novel computational tools, which have evolved and diversified in accordance with technological advancements, leading to the current myriad of RNA-seq tools. These tools, combined with the diverse computational skill sets of biomedical researchers, help to unlock the full potential of RNA-seq. The purpose of this review is to explain basic concepts in the computational analysis of RNA-seq data and define discipline-specific jargon.

Published in Frontiers in Genetics

ISSN: 1664-8021 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Biology (General): Genetics
Website: http://journal.frontiersin.org/journal/genetics
