BMC Bioinformatics (Sep 2024)

A novel approach to the analysis of Overall Survival (OS) as response with Progression-Free Interval (PFI) as condition based on the RNA-seq expression data in The Cancer Genome Atlas (TCGA)

  • Bo Lin,
  • Kaipeng Wang,
  • Yuan Yuan,
  • Yueguo Wang,
  • Qingyuan Liu,
  • Yulan Wang,
  • Jian Sun,
  • Wenwen Wang,
  • Huanli Wang,
  • Shusheng Zhou,
  • Kui Jin,
  • Mengping Zhang,
  • Yinglei Lai

DOI
https://doi.org/10.1186/s12859-024-05897-1
Journal volume & issue
Vol. 25, no. 1
pp. 1 – 21

Abstract

Background: Overall Survival (OS) and Progression-Free Interval (PFI) have both been collected as survival times in The Cancer Genome Atlas (TCGA). It is of biomedical interest to account for their dependence in pathway detection and survival prediction. We aimed to develop novel methods, based on parametric survival models, that integrate PFI as a condition for identifying pathways associated with OS and for predicting OS.

Results: Within the framework of conditional probability, we developed a family of frailty-based parametric models for this purpose, with an exponential or Weibull distribution as baseline. We also considered two classes of existing methods that use PFI as a covariate. We evaluated the performance of the three approaches by analyzing RNA-seq expression data from TCGA for lung squamous cell carcinoma and lung adenocarcinoma (LUNG), brain lower grade glioma and glioblastoma multiforme (GBMLGG), and skin cutaneous melanoma (SKCM). Our focus was on fourteen general cancer-related pathways. Predictive accuracy was evaluated by 10-fold cross-validation. For LUNG, the p53 signaling and cell cycle pathways were detected by all approaches, and the three approaches that consider PFI showed significantly better predictive performance than the approaches that do not. For GBMLGG, ten pathways (e.g., Wnt signaling, JAK-STAT signaling, and ECM-receptor interaction) were detected by all approaches, and the three approaches that consider PFI showed better predictive performance than the approaches that do not. For SKCM, the p53 signaling pathway was detected only by our Weibull-baseline model, and the three approaches that consider PFI showed significantly better predictive performance than the approaches that do not.

Conclusions: Based on our study, it is necessary to incorporate PFI into the survival analysis of OS. Furthermore, because PFI is itself a survival-type time, improved results can be achieved with our conditional-probability-based approach.
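As a rough illustration of the simpler class of existing methods mentioned in the abstract (PFI entered as a covariate in a parametric survival model), the Python sketch below evaluates a right-censored Weibull proportional-hazards log-likelihood, with the exponential baseline as the special case shape = 1. It is not the authors' frailty-based conditional model, whose form is not given in this abstract. The function names, the log-PFI transform, the toy data, and the parameter values are all hypothetical and chosen only for illustration.

```python
import math


def weibull_ph_loglik(times, events, covariate, shape, rate, beta):
    """Right-censored Weibull proportional-hazards log-likelihood.

    Assumed (illustrative) parameterization:
      hazard            h(t | x) = rate * shape * t**(shape - 1) * exp(beta * x)
      cumulative hazard H(t | x) = rate * t**shape * exp(beta * x)
    events[i] is 1 for an observed death and 0 for a censored time.
    """
    total = 0.0
    for t, d, x in zip(times, events, covariate):
        linear = beta * x
        if d:
            # An observed event contributes the log-hazard at time t.
            total += (math.log(rate) + math.log(shape)
                      + (shape - 1.0) * math.log(t) + linear)
        # Every subject contributes minus the cumulative hazard.
        total -= rate * t ** shape * math.exp(linear)
    return total


def exponential_ph_loglik(times, events, covariate, rate, beta):
    """Exponential baseline: the Weibull model with shape fixed at 1."""
    return weibull_ph_loglik(times, events, covariate, 1.0, rate, beta)


# Hypothetical toy data: OS times, OS event indicators, and PFI times.
os_times = [5.0, 12.0, 20.0, 7.5]
os_events = [1, 0, 1, 1]
pfi_times = [2.0, 10.0, 15.0, 3.0]

# Assumption: PFI enters the model on the log scale as a single covariate.
log_pfi = [math.log(p) for p in pfi_times]

ll_weibull = weibull_ph_loglik(os_times, os_events, log_pfi,
                               shape=1.3, rate=0.05, beta=-0.4)
ll_exponential = exponential_ph_loglik(os_times, os_events, log_pfi,
                                       rate=0.05, beta=-0.4)
print(ll_weibull, ll_exponential)
```

In practice the parameters would be estimated by maximizing this log-likelihood on training folds and assessed on held-out folds, in line with the 10-fold cross-validation described in the abstract.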

Keywords