Frontiers in Genetics (Aug 2020)

How Can Gene-Expression Information Improve Prognostic Prediction in TCGA Cancers: An Empirical Comparison Study on Regularization and Mixed Cox Models

  • Xinghao Yu,
  • Ting Wang,
  • Shuiping Huang,
  • Ping Zeng

DOI
https://doi.org/10.3389/fgene.2020.00920
Journal volume & issue
Vol. 11

Abstract

Background: Previous cancer prognostic prediction models often consider only the most important transcriptomic expressions, and their power is limited. It is unknown whether prediction power can be further improved when additional transcriptomic information is incorporated.

Methods: To integrate transcriptomes, four models are compared based on 32 types of cancer in the Cancer Genome Atlas, including the general Cox model with only clinical covariates, the Cox model with a lasso penalty (coxlasso), the Cox model with an elastic net penalty (coxenet), and the mixed-effects Cox model (coxlmm). Furthermore, we partition the survival variance into the relative contribution of clinical and transcriptomic components within the framework of coxlmm. Finally, the influence of different numbers of genes was evaluated in the context of coxlmm.

Results: Compared with the clinical covariates–only Cox model, the average prediction gain was 2.4% for coxlasso, 4.2% for coxenet, and 7.2% for coxlmm across 16 low-censored cancers; a significant elevation of prediction power was observed for SARC, SKCM, LGG, PAAD, and HNSC. Similar findings were observed for all 32 cancers with the average prediction gain of 2.7, 3.8, and 5.8% for coxlasso, coxenet, and coxlmm. Coxlmm always had comparable or better prediction performance relative to coxlasso and coxenet with an average of 2.8% prediction improvement across the 16 low-censored cancers. In addition, it is shown that the predictive accuracy of coxlmm generally increases with the number of genes included. The survival variance partition analysis demonstrates that the transcriptomic contribution was higher for some cancers (e.g., LGG, CESC, PAAD, SKCM, and SARC) and lower for others (e.g., BRCA, COAD, KIRC, and STAD).

Conclusion: This study demonstrates that the integration of transcriptomic information can substantially improve prognostic prediction accuracy, but the prediction performance is cancer-specific and varies across cancer types. It further reveals that gene expression exhibits distinct contributions to survival variation across cancers.
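
The abstract reports "prediction gain" relative to the clinical covariates-only Cox model but does not name the accuracy measure or the exact formula. As a rough illustration only, the sketch below assumes that accuracy is summarized by Harrell's concordance index (C-index) and that gain is the relative increase over the baseline; both are assumptions rather than the authors' stated method. The survival times, event flags, and risk scores are made-up toy values, and the function names are hypothetical.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair of subjects is comparable when the one with the shorter
    follow-up time had an observed event. The pair is concordant when
    that subject also has the higher predicted risk. Ties in risk
    count as half; pairs with tied times are skipped for simplicity.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        # a is the subject with the shorter time, b the longer one
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def relative_gain(c_model, c_baseline):
    """Relative improvement over the baseline, in percent (assumed definition)."""
    return 100.0 * (c_model - c_baseline) / c_baseline


# Toy data (hypothetical): follow-up times, event indicators (1 = death observed),
# and risk scores from a clinical-only model and a clinical + expression model.
times = [5, 8, 12, 15, 20, 24]
events = [1, 1, 0, 1, 0, 1]
clinical_risk = [0.9, 0.4, 0.6, 0.5, 0.3, 0.2]
combined_risk = [0.9, 0.8, 0.6, 0.5, 0.3, 0.2]

c_clinical = harrell_c_index(times, events, clinical_risk)
c_combined = harrell_c_index(times, events, combined_risk)
gain = relative_gain(c_combined, c_clinical)

print(f"C-index, clinical only: {c_clinical:.3f}")
print(f"C-index, clinical + expression: {c_combined:.3f}")
print(f"Relative gain: {gain:.1f}%")
```

In a real analysis the risk scores would come from fitted models evaluated on held-out samples, and a tested library implementation of the C-index would normally be preferred over this pairwise loop.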

Keywords