Impact of Hyperparameter Optimization to Enhance Machine Learning Performance: A Case Study on Breast Cancer Recurrence Prediction

Lorena González-Castro; Marcela Chávez; Patrick Duflot; Valérie Bleret; Guilherme Del Fiol; Martín López-Nores

doi:10.3390/app14135909

Applied Sciences (Jul 2024)

Affiliations

Lorena González-Castro: School of Telecommunication Engineering, Universidade de Vigo, 36310 Vigo, Spain
Marcela Chávez: Department of Information System Management, CHU of Liège, 4000 Liège, Belgium
Patrick Duflot: Department of Information System Management, CHU of Liège, 4000 Liège, Belgium
Valérie Bleret: Senology Department, CHU of Liège, 4000 Liège, Belgium
Guilherme Del Fiol: Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, UT 84132, USA
Martín López-Nores: atlanTTic Research Center, Department of Telematics Engineering, Universidade de Vigo, 36310 Vigo, Spain

DOI: https://doi.org/10.3390/app14135909
Journal volume & issue: Vol. 14, no. 13, p. 5909

Abstract

Accurate and early prediction of breast cancer recurrence is crucial for guiding medical decisions and improving treatment success. Machine learning (ML) has shown promise in this domain. However, its effectiveness depends critically on proper hyperparameter setting, a step that is not always performed systematically when ML models are developed. In this study, we aimed to highlight the impact of this process on the final performance of ML models through a real-world case study: predicting five-year recurrence in breast cancer patients. We compared the performance of five ML algorithms, Logistic Regression (LR), Decision Tree (DT), Gradient Boosting (GB), eXtreme Gradient Boosting (XGB), and Deep Neural Network (DNN), before and after optimizing their hyperparameters. Simpler algorithms performed better with the default hyperparameters. After optimization, however, the more complex algorithms demonstrated superior performance. The AUCs obtained before and after adjustment were 0.70 vs. 0.84 for XGB, 0.64 vs. 0.75 for DNN, 0.70 vs. 0.80 for GB, 0.62 vs. 0.70 for DT, and 0.77 vs. 0.72 for LR. The results underscore the critical importance of hyperparameter selection in the development of ML algorithms for predicting cancer recurrence. Neglecting this step can undermine the potential of more powerful algorithms and lead to the choice of suboptimal models.
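
The before-and-after comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' pipeline: the data are synthetic, the model is a hand-written k-nearest-neighbours scorer (not one of the five algorithms in the study), the search strategy is a plain grid search (the abstract does not state which optimization method was used), and all names (`roc_auc`, `knn_score`, `make_data`, `evaluate`) are hypothetical. It only shows the general pattern: score a default setting, tune the hyperparameter on a validation split, then compare both settings on a held-out test split using AUC.

```python
import random

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def knn_score(train, x, k):
    """Fraction of positive labels among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))[:k]
    return sum(label for _, label in nearest) / k

def make_data(n, rng):
    """Synthetic, noisy two-class data (assumption: NOT the study's clinical data)."""
    data = []
    for _ in range(n):
        label = int(rng.random() < 0.3)  # roughly 30% positives
        x = (rng.gauss(label * 1.0, 1.0), rng.gauss(label * 0.5, 1.0))
        data.append((x, label))
    return data

def evaluate(train, split, k):
    """AUC of the k-NN scorer on a given split."""
    y = [label for _, label in split]
    s = [knn_score(train, x, k) for x, _ in split]
    return roc_auc(y, s)

rng = random.Random(0)
train, val, test = make_data(200, rng), make_data(100, rng), make_data(100, rng)

default_k = 1                    # stand-in for a library's default setting
grid = [1, 3, 5, 11, 21, 41]     # hypothetical search space

# Tune on the validation split only; the test split stays untouched until the end.
val_auc = {k: evaluate(train, val, k) for k in grid}
best_k = max(val_auc, key=val_auc.get)

print(f"default k={default_k}: test AUC = {evaluate(train, test, default_k):.2f}")
print(f"tuned   k={best_k}: test AUC = {evaluate(train, test, best_k):.2f}")
```

Selecting the hyperparameter on a validation split and reporting on a separate test split keeps the final AUC from being inflated by the search itself. On real data, cross-validation would be a common alternative to a single validation split.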

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
