Proteome-wide copy-number estimation from transcriptomics

Andrew J Sweatt; Cameron D Griffiths; Sarah M Groves; B Bishal Paudel; Lixin Wang; David F Kashatus; Kevin A Janes

doi:10.1038/s44320-024-00064-3

Molecular Systems Biology (Sep 2024)

Affiliations

Andrew J Sweatt: Department of Biomedical Engineering, University of Virginia
Cameron D Griffiths: Department of Biomedical Engineering, University of Virginia
Sarah M Groves: Department of Biomedical Engineering, University of Virginia
B Bishal Paudel: Department of Biomedical Engineering, University of Virginia
Lixin Wang: Department of Biomedical Engineering, University of Virginia
David F Kashatus: Department of Microbiology, Immunology & Cancer Biology, University of Virginia
Kevin A Janes: Department of Biomedical Engineering, University of Virginia

Journal volume & issue: Vol. 20, no. 11, pp. 1230–1256

Abstract

Protein copy numbers constrain systems-level properties of regulatory networks, but proportional proteomic data remain scarce compared to RNA-seq. We related mRNA to protein statistically using best-available data from quantitative proteomics and transcriptomics for 4366 genes in 369 cell lines. The approach starts with a protein’s median copy number and hierarchically appends mRNA–protein and mRNA–mRNA dependencies to define an optimal gene-specific model linking mRNAs to protein. For dozens of cell lines and primary samples, these protein inferences from mRNA outmatch stringent null models, a count-based protein-abundance repository, empirical mRNA-to-protein ratios, and a proteogenomic DREAM challenge winner. The optimal mRNA-to-protein relationships capture biological processes along with hundreds of known protein-protein complexes, suggesting mechanistic relationships. We use the method to identify a viral-receptor abundance threshold for coxsackievirus B3 susceptibility from 1489 systems-biology infection models parameterized by protein inference. When applied to 796 RNA-seq profiles of breast cancer, inferred copy-number estimates collectively re-classify 26–29% of luminal tumors. By adopting a gene-centered perspective of mRNA–protein covariation across different biological contexts, we achieve accuracies comparable to the technical reproducibility of contemporary proteomics.
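
The hierarchical idea in the abstract (start from a median, then append mRNA–protein and mRNA–mRNA dependencies only when they help) can be illustrated with a small sketch. This is not the authors' implementation: the function names `bic` and `select_model`, the use of ordinary least squares, and the Bayesian information criterion as the stopping rule are all assumptions made for illustration, and the inputs are assumed to be log-transformed abundances for one gene across samples.

```python
import numpy as np

def bic(residuals, n_params):
    """Bayesian information criterion for Gaussian residuals (lower is better)."""
    n = residuals.size
    mean_sq = float(np.sum(residuals ** 2)) / n
    # Small constant guards against log(0) on a perfect fit.
    return n * np.log(mean_sq + 1e-12) + n_params * np.log(n)

def select_model(protein, own_mrna, other_mrnas):
    """Pick a model tier for one gene (hypothetical, simplified scheme).

    protein     : (n,) protein abundance across n samples
    own_mrna    : (n,) abundance of the gene's own mRNA
    other_mrnas : (n, k) abundances of k other mRNAs (k may be 0)
    """
    n = protein.size

    # Tier 1: a constant prediction at the median protein level.
    chosen = "median"
    best = bic(protein - np.median(protein), 1)

    # Tiers 2 and 3: nested linear models that append predictors.
    intercept = np.ones(n)
    designs = [("own_mrna", np.column_stack([intercept, own_mrna]))]
    if other_mrnas.shape[1] > 0:
        designs.append(
            ("own_plus_other_mrnas",
             np.column_stack([intercept, own_mrna, other_mrnas]))
        )

    for name, X in designs:
        beta, *_ = np.linalg.lstsq(X, protein, rcond=None)
        score = bic(protein - X @ beta, X.shape[1])
        if score >= best:
            break  # the richer tier does not pay for its extra parameters
        chosen, best = name, score
    return chosen
```

A call such as `select_model(protein, own_mrna, other_mrnas)` returns the name of the simplest tier that the criterion supports. Because the loop stops at the first tier that fails to improve, a gene is only given the richest model if each intermediate step was also justified.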

Published in Molecular Systems Biology

ISSN: 1744-4292 (Online)
Publisher: Springer Nature
Country of publisher: United Kingdom
LCC subjects: Science: Biology (General); Medicine: Medicine (General)
Website: https://www.embopress.org/journal/17444292
