Transcriptomic pan‐cancer analysis using rank‐based Bayesian inference

Valeria Vitelli; Thomas Fleischer; Jørgen Ankill; Elja Arjas; Arnoldo Frigessi; Vessela N. Kristensen; Manuela Zucknick

doi:10.1002/1878-0261.13354

Molecular Oncology (Apr 2023)

Affiliations

Valeria Vitelli: Oslo Centre for Biostatistics and Epidemiology, University of Oslo, Norway
Thomas Fleischer: Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Norway
Jørgen Ankill: Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Norway
Elja Arjas: Oslo Centre for Biostatistics and Epidemiology, University of Oslo, Norway
Arnoldo Frigessi: Oslo Centre for Biostatistics and Epidemiology, University of Oslo, Norway
Vessela N. Kristensen: Department of Medical Genetics, Clinic for Laboratory Medicine, Oslo University Hospital, Norway
Manuela Zucknick: Oslo Centre for Biostatistics and Epidemiology, University of Oslo, Norway

Journal volume & issue: Vol. 17, no. 4
pp. 548 – 563

Abstract

The analysis of whole genomes of pan‐cancer data sets poses a challenge for researchers, and we contribute to the literature on identifying robust subgroups with clear biological interpretation. Specifically, we tackle this unsupervised problem via a novel rank‐based Bayesian clustering method. The method has three advantages: it integrates and quantifies all uncertainties related to both the input data and the model; it gives a probabilistic interpretation of the final results, which allows straightforward assessment of cluster stability and thus reliable conclusions; and it offers a transparent biological interpretation of the identified clusters, since each cluster is characterized by its top‐ranked genomic features. We applied our method to RNA‐seq data from cancer samples from 12 tumor types from the Cancer Genome Atlas. We identified a robust clustering that mostly reflects tissue of origin but also includes pan‐cancer clusters. Importantly, we identified three pan‐squamous clusters composed of a mix of lung squamous cell carcinoma, head and neck squamous carcinoma, and bladder cancer, with different biological functions over‐represented in the top genes that characterize the three clusters. We also found two novel subtypes of kidney cancer that show different prognosis, and we reproduced known subtypes of breast cancer. Taken together, our method allows the identification of robust and biologically meaningful clusters of pan‐cancer samples.

Published in Molecular Oncology

ISSN: 1574-7891 (Print); 1878-0261 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Medicine: Internal medicine: Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Website: https://febs.onlinelibrary.wiley.com/journal/18780261
