Frontiers in Oncology (Apr 2023)

A comparison of machine learning models for predicting urinary incontinence in men with localized prostate cancer

  • Hajar Hasannejadasl,
  • Biche Osong,
  • Inigo Bermejo,
  • Henk van der Poel,
  • Ben Vanneste,
  • Joep van Roermund,
  • Katja Aben,
  • Zhen Zhang,
  • Lambertus Kiemeney,
  • Inge Van Oort,
  • Renee Verwey,
  • Laura Hochstenbach,
  • Esther Bloemen,
  • Andre Dekker,
  • Rianne R. R. Fijten

DOI
https://doi.org/10.3389/fonc.2023.1168219
Journal volume & issue
Vol. 13

Abstract

Introduction: Urinary incontinence (UI) is a common side effect of prostate cancer treatment, but it is difficult to predict in clinical practice. Machine learning (ML) models have shown promising results in predicting outcomes, yet the lack of transparency in complex "black-box" models has made clinicians wary of relying on them for sensitive decisions. Finding a balance between accuracy and explainability is therefore crucial for the implementation of ML models. The aim of this study was to employ three ML classifiers to predict the probability of UI in men with localized prostate cancer 1 year and 2 years after treatment, and to compare their accuracy and explainability.

Methods: We used the ProZIB dataset from the Netherlands Comprehensive Cancer Organization (Integraal Kankercentrum Nederland; IKNL), which contained clinical, demographic, and patient-reported outcome measure (PROM) data of 964 patients from 65 Dutch hospitals. Logistic Regression (LR), Random Forest (RF), and Support Vector Machine (SVM) algorithms were applied to predict (in)continence after prostate cancer treatment.

Results: All models were externally validated according to the TRIPOD Type 3 guidelines, and their performance was assessed by accuracy, sensitivity, specificity, and area under the ROC curve (AUC). While all three models performed similarly, LR showed slightly better accuracy than RF and SVM in predicting the risk of UI one year after prostate cancer treatment, achieving an accuracy of 0.75, a sensitivity of 0.82, and an AUC of 0.79. All models for the 2-year outcome performed poorly in the validation set, with an accuracy of 0.6 for LR, 0.65 for RF, and 0.54 for SVM.

Conclusion: The outcomes of our study demonstrate the promise of using non-black-box models, such as LR, to assist clinicians in recognizing high-risk patients and making informed treatment choices. The coefficients of the LR model show the importance of each feature in predicting results, and the generated nomogram provides an accessible illustration of how each feature affects the predicted outcome. Additionally, the model's simplicity and interpretability make it a more appropriate option in scenarios where understanding the model's predictions is essential.
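
For readers unfamiliar with the four metrics reported above, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUC can be computed from true labels and predicted probabilities. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy data are assumptions made for illustration only, and the numbers it produces are unrelated to the study's results.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) after thresholding predicted probabilities."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [p for label, p in zip(y_true, y_prob) if label == 1]
    neg = [p for label, p in zip(y_true, y_prob) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def evaluate(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute the four metrics named in the abstract."""
    auc_value = auc(y_true, y_prob)  # also checks that both classes are present
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "auc": auc_value,
    }


# Toy data (hypothetical): 1 = incontinent, 0 = continent.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_probs = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
metrics = evaluate(toy_labels, toy_probs)
print(metrics)
# accuracy, sensitivity, and specificity are each 0.75; auc is 0.875
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the abstract reports them together.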

Keywords