Improved personalized survival prediction of patients with diffuse large B-cell Lymphoma using gene expression profiling

Adrián Mosquera Orgueira; José Ángel Díaz Arias; Miguel Cid López; Andrés Peleteiro Raíndo; Beatriz Antelo Rodríguez; Carlos Aliste Santos; Natalia Alonso Vence; Ángeles Bendaña López; Aitor Abuín Blanco; Laura Bao Pérez; Marta Sonia González Pérez; Manuel Mateo Pérez Encinas; Máximo Francisco Fraga Rodríguez; José Luis Bello López

doi:10.1186/s12885-020-07492-y

BMC Cancer (Oct 2020)


Affiliations

Adrián Mosquera Orgueira: Health Research Institute of Santiago de Compostela (IDIS)
José Ángel Díaz Arias: Health Research Institute of Santiago de Compostela (IDIS)
Miguel Cid López: Health Research Institute of Santiago de Compostela (IDIS)
Andrés Peleteiro Raíndo: Health Research Institute of Santiago de Compostela (IDIS)
Beatriz Antelo Rodríguez: Health Research Institute of Santiago de Compostela (IDIS)
Carlos Aliste Santos: Health Research Institute of Santiago de Compostela (IDIS)
Natalia Alonso Vence: Health Research Institute of Santiago de Compostela (IDIS)
Ángeles Bendaña López: Health Research Institute of Santiago de Compostela (IDIS)
Aitor Abuín Blanco: Health Research Institute of Santiago de Compostela (IDIS)
Laura Bao Pérez: Health Research Institute of Santiago de Compostela (IDIS)
Marta Sonia González Pérez: Health Research Institute of Santiago de Compostela (IDIS)
Manuel Mateo Pérez Encinas: Health Research Institute of Santiago de Compostela (IDIS)
Máximo Francisco Fraga Rodríguez: Health Research Institute of Santiago de Compostela (IDIS)
José Luis Bello López: Health Research Institute of Santiago de Compostela (IDIS)

Journal volume & issue: Vol. 20, no. 1, pp. 1–9

Abstract


Background

Thirty to forty percent of patients with Diffuse Large B-cell Lymphoma (DLBCL) have an adverse clinical evolution. The increased understanding of DLBCL biology has shed light on the clinical evolution of this disease, leading to the discovery of prognostic factors based on gene expression data, genomic rearrangements and mutational subgroups. Nevertheless, additional efforts are needed to enable survival predictions at the patient level. In this study we investigated new machine learning-based models of survival using transcriptomic and clinical data.

Methods

Gene expression profiling (GEP) data from 2 different publicly available retrospective DLBCL cohorts were analyzed. Cox regression and unsupervised clustering were performed on the largest cohort to identify probes associated with overall survival. Random forests were created to model survival using combinations of GEP data, cell-of-origin (COO) classification and clinical information. Cross-validation was used to compare model results in the training set, and Harrell's concordance index (c-index) was used to assess each model's predictive performance. Results were validated in an independent test set.

Results

Two hundred thirty-three and sixty-four patients were included in the training and test sets, respectively. Initially we derived and validated a 4-gene expression clusterization that was independently associated with lower survival in 20% of patients. This pattern included the following genes: TNFRSF9, BIRC3, BCL2L1 and G3BP2. Thereafter, we applied machine-learning models to predict survival. A set of 102 genes was highly predictive of disease outcome, outperforming available clinical information and COO classification. The final best model integrated clinical information, COO classification, the 4-gene-based clusterization and the expression levels of 50 individual genes (training set c-index, 0.8404; test set c-index, 0.7942).

Conclusion

Our results indicate that DLBCL survival models based on the application of machine learning algorithms to gene expression and clinical data can largely outperform other important prognostic variables such as disease stage and COO. Head-to-head comparisons with other risk stratification models are needed to assess their relative usefulness.
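
As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of how Harrell's c-index can be computed for right-censored survival data. It is not the authors' code: the function name `harrell_c_index`, the simplified handling of tied survival times (such pairs are skipped) and the toy values are assumptions made for this example only and do not come from the study cohorts.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's concordance index for right-censored data (simplified sketch).

    times       -- follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means higher predicted risk

    A pair (i, j) is comparable when patient i had an observed event and a
    strictly shorter follow-up time than patient j. Pairs with tied times are
    skipped here, which is a simplifying assumption.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier death was ranked as higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; c-index is undefined")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [5.0, 10.0, 15.0, 20.0]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.1, 0.4, 0.7]
toy_c = harrell_c_index(toy_times, toy_events, toy_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk, which is the scale on which the reported training and test c-indices should be read. Established survival analysis libraries provide more complete implementations, including more careful treatment of ties.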

Published in BMC Cancer

ISSN: 1471-2407 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Internal medicine: Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Website: http://bmccancer.biomedcentral.com
