Tehran University Medical Journal (Jan 2022)
The prediction of lymphedema via the combination of the selected data mining algorithms
Abstract
Background: Breast cancer is the second leading cause of cancer death in women, after lung cancer. Given the importance of predicting this disease, data mining methods have become more significant in medical research than before. Data mining algorithms can be a great help in preventing the development of lymphedema in patients. The aim of this study was to create a diagnosis system that can predict the probability of lymphedema in breast cancer patients.
Methods: In the present study, lymphedema risk factors were collected for 1117 patients with breast cancer. The likelihood of developing lymphedema was predicted using ensemble learning over 5 heterogeneous classification algorithms, combined with feature selection and a genetic algorithm (the Two-layer Ensemble Feature Selection method). Data from patients with breast cancer were collected from 2009 to 2018 and preprocessed; the optimized ensemble learning algorithm with feature selection was then used to estimate the likelihood of developing lymphedema for a new patient. Finally, the factors affecting the disease were extracted. Excluding the data collection period, the study ran from September 2019 to February 2021. It was performed at Seyed Khandan Rehabilitation Center, Tehran, Iran.
Results: The accuracy of the ensemble learning method with selected classification algorithms (SVM with RBF kernel) was 87%, and the accuracy of ensemble learning with the feature selection method was 90%. Based on the final evaluation of the proposed method, the most influential risk factors for lymphedema were extracted.
Conclusion: Treatment and diagnosis of breast cancer are not without complications, and one of the most important is lymphedema of the upper extremities, which can affect patients' quality of life. It is essential to have a method that, using a patient's own clinical and demographic characteristics, can accurately indicate to a specialist whether a new patient will develop lymphedema in the future, or how likely that is.
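The combination described in the Methods (a voting ensemble of several classifiers, with a genetic algorithm searching over feature subsets) can be illustrated with a minimal sketch. This is not the authors' Two-layer Ensemble Feature Selection method, whose details the abstract does not give. The function names, the majority-vote combiner, the one-point crossover, and all parameter defaults below are assumptions chosen for illustration, and the `fitness` callable is a hypothetical stand-in for "accuracy of the ensemble trained on the selected features".

```python
import random
from collections import Counter


def majority_vote(predictions):
    """Combine label lists from several classifiers into one label per sample.

    `predictions` holds one list of labels per classifier (assumed layout).
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def ga_select(n_features, fitness, pop_size=20, generations=30,
              p_mut=0.1, seed=0):
    """Search for a 0/1 feature mask that maximises `fitness(mask)`.

    Assumes n_features >= 2 and pop_size >= 4. In a real pipeline, `fitness`
    would train the ensemble on the masked features and return its
    validation accuracy; here it is any callable supplied by the caller.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better half unchanged (elitism), refill with offspring.
        parents = sorted(pop, key=fitness, reverse=True)[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability p_mut (mutation).
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

In use, each of the five classifiers would produce a list of predicted labels for the validation patients, `majority_vote` would merge them, and `accuracy` of the merged labels would serve as the fitness that `ga_select` maximises over feature masks.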