Confidence intervals of survival predictions with neural networks trained on molecular data

Elvire Roblin; Paul-Henry Cournède; Stefan Michiels

Informatics in Medicine Unlocked (Jan 2024)

Affiliations

Elvire Roblin: MICS - Laboratory of Mathematics and Computer Science, CentraleSupélec, Paris-Saclay University, France; Oncostat U1018, Inserm, Paris-Saclay University, labeled Ligue Contre le Cancer, Villejuif, France; Department of Biostatistics and Epidemiology, Gustave Roussy, Paris-Saclay University, France; Corresponding author at: Oncostat U1018, Inserm, Paris-Saclay University, labeled Ligue Contre le Cancer, Villejuif, France.
Paul-Henry Cournède: MICS - Laboratory of Mathematics and Computer Science, CentraleSupélec, Paris-Saclay University, France
Stefan Michiels: Oncostat U1018, Inserm, Paris-Saclay University, labeled Ligue Contre le Cancer, Villejuif, France; Department of Biostatistics and Epidemiology, Gustave Roussy, Paris-Saclay University, France

Journal volume & issue: Vol. 44, p. 101426

Abstract

In medicine, an important objective is predicting patients’ survival based on their molecular and clinical characteristics. In this context, neural networks have recently been used for their ability to capture complex interactions in the data. Measuring the uncertainty associated with survival estimates obtained by neural networks is essential to enhance the reliability of predictions. We compared four methods adapted to multilayer perceptrons (MLPs) for building confidence intervals at the patient level. The methods were based either on bootstrap with Boot (Efron, 1979), ensembling with DeepEns (Lakshminarayanan et al., 2016), or Monte-Carlo Dropout with MCDrop and BMask (Gal and Ghahramani, 2016; Mancini et al., 2020). A comparison was made through MLP-based survival models: CoxCC and CoxTime (Kvamme et al., 2019) in a continuous time framework, DeepHit (Lee et al., 2018) and PLANN (Biganzoli et al., 1998) in a discrete time framework. We applied the methods to a simulation study, enabling us to estimate a coverage rate of the estimated confidence intervals. We also applied them to real-world datasets, and predicted the survival probability for patients with breast cancer and patients with lung cancer.

In the simulation study, CoxCC and CoxTime obtained the mean C-indices numerically closest to those from the Oracle model (mean C-index of 0.723 for CoxCC, 0.726 for CoxTime, versus 0.743 for the Oracle model). Regarding the confidence intervals of survival probabilities, Boot with CoxCC obtained a coverage rate of 96.5%, the closest to the nominal value of 95%. MCDrop was slightly anticonservative and obtained a coverage rate of 89.8% with CoxTime. This method may represent a reasonable compromise between coverage and computational time. In the breast cancer cohort, MLPs had difficulty capturing additional prognostic information from the molecular data. In contrast, in the lung cancer cohort, the models led to substantially stronger discrimination values when adding molecular data to the clinical variables. In conclusion, we were able to represent uncertainty in the survival estimates at particular time points at the patient level using MLPs in the form of 95% confidence intervals. We recommend using CoxTime with either Boot or, for less intensive computation, MCDrop.
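
The abstract does not give implementation details, so the following is only a minimal sketch of how patient-level 95% intervals and a coverage rate might be computed from repeated predictions (for example bootstrap refits or stochastic dropout passes). It assumes a percentile interval, which may differ from the authors' construction; the function names `percentile_interval` and `coverage_rate` and the synthetic data are hypothetical and stand in for real model outputs.

```python
import numpy as np

def percentile_interval(replicates, level=0.95):
    """Percentile interval per patient.

    replicates: array of shape (n_replicates, n_patients) holding predicted
    survival probabilities at one fixed time point, one row per bootstrap
    refit or stochastic forward pass (assumed input layout).
    """
    alpha = 1.0 - level
    lower = np.quantile(replicates, alpha / 2, axis=0)
    upper = np.quantile(replicates, 1 - alpha / 2, axis=0)
    return lower, upper

def coverage_rate(lower, upper, truth):
    """Fraction of patients whose true survival probability lies in the interval.

    This requires the true probabilities, which are only known in a simulation.
    """
    return float(np.mean((lower <= truth) & (truth <= upper)))

# Synthetic stand-in for model output (assumption: not data from the paper).
rng = np.random.default_rng(0)
true_surv = rng.uniform(0.2, 0.9, size=50)                 # one value per patient
replicates = true_surv + rng.normal(0.0, 0.05, size=(200, 50))
replicates = np.clip(replicates, 0.0, 1.0)                 # keep valid probabilities

lower, upper = percentile_interval(replicates)
coverage = coverage_rate(lower, upper, true_surv)
print(f"coverage rate: {coverage:.3f}")
```

In this toy setup the replicates are centred on the truth, so the coverage is optimistic by construction; with a real model, bias in the predictions would pull it below the nominal level.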

Published in Informatics in Medicine Unlocked

ISSN: 2352-9148 (Online)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://www.journals.elsevier.com/informatics-in-medicine-unlocked/
