Computational and Structural Biotechnology Journal (Jan 2022)
A unified platform enabling biomarker ranking and validation for 1562 drugs using transcriptomic data of 1250 cancer cell lines
Abstract
Intro: In vitro cell line models provide a valuable resource to investigate compounds useful in the systemic chemotherapy of cancer. However, because the data are dispersed across several different databases, the utilization of these resources is limited. Here, our aim was to establish a platform enabling the validation of chemoresistance-associated genes and the ranking of available cell line models. Methods: We processed four independent databases, DepMap, GDSC1, GDSC2, and CTRP. The gene expression data were quantile normalized, and HUGO gene names were assigned to identify the genes unambiguously. Resistance values were exported for all agents. The correlation between gene expression and therapy resistance was computed using the ROC test. Results: We combined four datasets with chemosensitivity data of 1562 agents and transcriptome-level gene expression of 1250 cancer cell lines. We have set up an online tool utilizing this database to correlate available cell line sensitivity data and treatment response in a uniform analysis pipeline (www.rocplot.com/cells). We employed the established pipeline to rank genes related to resistance against afatinib and lapatinib, two inhibitors of the tyrosine-kinase domain of ERBB2. Discussion: The computational tool is useful 1) to correlate gene expression with resistance, 2) to identify and rank resistant and sensitive cell lines, and 3) to rank resistance-associated genes, cancer hallmarks, and gene ontology pathways. The platform will be an invaluable support to speed up cancer research by validating gene-resistance correlations and by selecting the best cell line models for new experiments.
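The Methods steps (quantile normalization followed by a ROC-based comparison of expression and resistance) can be illustrated with a minimal Python sketch. It is not the platform's implementation. It assumes that each cell line has already been labeled resistant or sensitive for a given agent (the abstract does not state how resistance values are dichotomized), that the expression matrix is arranged as genes by cell lines, and that genes are ranked by how far their AUC lies from 0.5. The function names `quantile_normalize`, `roc_auc`, and `rank_genes` are hypothetical.

```python
import numpy as np


def quantile_normalize(expr):
    """Quantile-normalize a genes x cell-lines matrix.

    Every column is forced onto the same reference distribution (the mean
    of the sorted columns). Ties are not averaged in this simple sketch.
    """
    expr = np.asarray(expr, dtype=float)
    order = np.argsort(expr, axis=0)              # per-column sort order
    reference = np.sort(expr, axis=0).mean(axis=1)  # mean of each quantile
    normalized = np.empty_like(expr)
    for j in range(expr.shape[1]):
        # The k-th smallest value in column j receives the k-th reference value.
        normalized[order[:, j], j] = reference
    return normalized


def roc_auc(expression, resistant):
    """Area under the ROC curve for one gene.

    Computed from pairwise comparisons (equivalent to the Mann-Whitney U
    statistic): the fraction of resistant/sensitive pairs in which the
    resistant line has the higher expression, with ties counted as half.
    """
    expression = np.asarray(expression, dtype=float)
    resistant = np.asarray(resistant, dtype=bool)
    pos = expression[resistant]
    neg = expression[~resistant]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("both resistant and sensitive cell lines are required")
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


def rank_genes(expr, resistant, gene_names):
    """Rank genes by the distance of their AUC from 0.5 (assumed criterion)."""
    rows = np.asarray(expr, dtype=float)
    aucs = [roc_auc(row, resistant) for row in rows]
    return sorted(zip(gene_names, aucs),
                  key=lambda item: abs(item[1] - 0.5),
                  reverse=True)


if __name__ == "__main__":
    # Toy data: 3 hypothetical genes measured in 4 hypothetical cell lines.
    toy_expr = [[1.0, 2.0, 3.0, 4.0],
                [1.0, 3.0, 2.0, 4.0],
                [2.0, 2.0, 2.0, 2.0]]
    toy_resistant = [False, False, True, True]
    for gene, auc in rank_genes(toy_expr, toy_resistant,
                                ["GENE_A", "GENE_B", "GENE_C"]):
        print(gene, auc)
```

In this sketch an AUC near 1 means that higher expression accompanies resistance, an AUC near 0 means that higher expression accompanies sensitivity, and an AUC near 0.5 means that the gene does not separate the two groups. A significance test on the AUC is omitted here.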