Applied Sciences (Jul 2024)

Impact of Hyperparameter Optimization to Enhance Machine Learning Performance: A Case Study on Breast Cancer Recurrence Prediction

  • Lorena González-Castro,
  • Marcela Chávez,
  • Patrick Duflot,
  • Valérie Bleret,
  • Guilherme Del Fiol,
  • Martín López-Nores

DOI
https://doi.org/10.3390/app14135909
Journal volume & issue
Vol. 14, no. 13
p. 5909

Abstract

Read online

Accurate and early prediction of breast cancer recurrence is crucial to guide medical decisions and treatment success. Machine learning (ML) has shown promise in this domain. However, its effectiveness critically depends on proper hyperparameter setting, a step that is not always performed systematically in the development of ML models. In this study, we aimed to highlight the impact that this process has on the final performance of ML models through a real-world case study by predicting the five-year recurrence of breast cancer patients. We compared the performance of five ML algorithms (Logistic Regression, Decision Tree, Gradient Boosting, eXtreme Gradient Boost, and Deep Neural Network) before and after optimizing their hyperparameters. Simpler algorithms showed better performance using the default hyperparameters. However, after the optimization process, the more complex algorithms demonstrated superior performance. The AUCs obtained before and after adjustment were 0.7 vs. 0.84 for XGB, 0.64 vs. 0.75 for DNN, 0.7 vs. 0.8 for GB, 0.62 vs. 0.7 for DT, and 0.77 vs. 0.72 for LR. The results underscore the critical importance of hyperparameter selection in the development of ML algorithms for the prediction of cancer recurrence. Neglecting this step can undermine the potential of more powerful algorithms and lead to the choice of suboptimal models.

Keywords