Cancer Medicine (Sep 2024)
Utilizing patient data: A tutorial on predicting second cancer with machine learning models
Abstract
Background: This article examines the risk of secondary cancer (SC) following radiation therapy (RT) and argues that new modeling techniques are needed to mitigate this risk.
Methods: Machine learning (ML) models, specifically decision trees, are used to build a practical framework for predicting the occurrence of SC from patient data. The models are trained and evaluated on a dataset of instances, each described by a set of attributes.
Results & Discussion: The framework helps classify patients into high-risk and low-risk groups, which in turn supports personalized treatment plans and interventions. The paper also reviews the many factors that contribute to the likelihood of SC, such as radiation dose, patient age, and genetic predisposition, and emphasizes that current models do not capture all relevant parameters. These limitations arise from non-linear dependencies between variables and from the omission of factors such as genetics, hormones, lifestyle, radiation from secondary particles, and imaging dose.
Conclusion: The practical value of this work lies in improving the understanding and prediction of SC following RT, supporting personalized treatment approaches, and establishing a framework for using patient data in ML models.
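As an illustration of the decision-tree classification the abstract describes, the following is a minimal sketch in plain Python and not the authors' implementation. Everything in it is an assumption made for the example: the two attributes (dose and age), the eight records and their labels (invented, with no clinical meaning), the Gini impurity criterion, the depth limit of 2, and the function names `gini`, `best_split`, `build_tree`, and `predict`.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (attribute index, threshold) that most reduces weighted
    Gini impurity, or None if no split improves on the unsplit group."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=2):
    """Recursively grow a tree; a leaf is simply the majority class label."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([rows[i] for i in left],
                           [labels[i] for i in left], depth + 1, max_depth),
        "right": build_tree([rows[i] for i in right],
                            [labels[i] for i in right], depth + 1, max_depth),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return that leaf's class label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Hypothetical toy records: (dose, age at treatment). Values and labels are
# invented purely to exercise the code and carry no clinical meaning.
rows = [(10, 25), (12, 60), (20, 35), (25, 70),
        (40, 30), (45, 65), (50, 28), (55, 72)]
labels = ["low", "low", "low", "low", "high", "low", "high", "low"]

tree = build_tree(rows, labels)
print(predict(tree, (48, 29)))  # high
print(predict(tree, (48, 66)))  # low
```

On this toy data the tree first splits on the second attribute and then on the first, which shows how a decision tree can represent an interaction between attributes that a single threshold on one variable would miss. A real study would use a validated dataset, a held-out test set, and an established library rather than this hand-rolled version.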
Keywords