BMC Cancer (Nov 2024)
Prediction model for ocular metastasis of breast cancer: machine learning model development and interpretation study
Abstract
Background: Breast cancer (BC) is caused by the uncontrolled proliferation of breast epithelial cells followed by malignant transformation, and it has the highest incidence among malignant tumors in women. BC metastasizes through direct and lymphatic spread. Although ocular metastasis is relatively rare, it indicates a poor prognosis. We used machine learning (ML) to build a model for analyzing the risk factors for ocular metastasis of BC.

Methods: Clinical data from 2225 patients with BC, collected from 2003 to 2019, were randomly divided into training and test sets at a ratio of 7:3. Based on the presence or absence of ocular metastasis, patients were assigned to the ocular metastasis (OM) or non-ocular metastasis (NOM) group. Univariate and multivariate logistic regression analyses and least absolute shrinkage and selection operator (LASSO) regression were performed. We used six ML algorithms to build predictive models and applied 10-fold cross-validation for internal validation. The area under the receiver operating characteristic (ROC) curve (AUC) was used to evaluate the predictive ability of each model. In addition, we built a web-based risk calculator on the best-performing model to facilitate its clinical application. Shapley additive explanations (SHAP) were used to identify risk factors and to interpret the black-box model.

Results: Univariate logistic regression analysis showed that histopathology (other types), axillary lymph node metastasis (ALNM) (> 4), Ca2+, total cholesterol (TC), low-density lipoprotein (LDL), apolipoprotein A (ApoA), carcinoembryonic antigen (CEA), carbohydrate antigen (CA) 125, CA153, CA199, alkaline phosphatase (ALP), and hemoglobin (Hb) were risk factors for ocular metastasis of BC. Multivariate logistic regression analysis showed that CA153, ApoA, and LDL were risk factors for ocular metastasis of BC. LASSO identified ALNM, LDL, CA125, Hb, ALP, and CA199 as the top six variables for the diagnosis of ocular metastasis in BC. Bootstrapped aggregation (BAG) showed the strongest discriminative ability (AUC = 0.992, accuracy = 0.953, sensitivity = 0.987). We therefore used the BAG model to build an online web calculator that helps clinicians assess the risk of ocular metastasis in BC. In addition, two typical cases were analyzed to illustrate the interpretability of the model.

Conclusion: We used ML to establish a risk prediction model for ocular metastasis of BC, and BAG showed the best performance. The model can predict the risk of OM in patients with BC, facilitate early diagnosis and timely treatment, and reduce the burden on society.
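The evaluation protocol described above (a 7:3 random split, 10-fold cross-validation, and AUC, accuracy, and sensitivity as metrics) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the fixed random seed, the 0.5 decision threshold, and the four-sample toy data are all assumptions made purely for illustration.

```python
import random

# Hypothetical helpers illustrating the evaluation protocol; not the authors' code.

def split_7_3(n, seed=0):
    """Shuffle indices 0..n-1 and split them 70/30 (the seed is an assumption)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = (7 * n) // 10  # integer arithmetic avoids floating-point rounding
    return idx[:cut], idx[cut:]

def k_folds(indices, k=10):
    """Partition indices into k disjoint, nearly equal validation folds."""
    return [indices[i::k] for i in range(k)]

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def accuracy_sensitivity(labels, scores, threshold=0.5):
    """Accuracy and sensitivity (true-positive rate) at an assumed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    return correct / len(labels), tp / (tp + fn)

# Split a cohort of 2225 patient indices 7:3, then cut the training set into 10 folds.
train_idx, test_idx = split_7_3(2225)
folds = k_folds(train_idx, k=10)

# Toy labels (1 = OM, 0 = NOM) and model scores, invented for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
toy_acc, toy_sens = accuracy_sensitivity(toy_labels, toy_scores)
```

In a cross-validation loop, each of the 10 folds would serve once as the validation set while a model is fitted on the remaining nine; the held-out 30% is used only for the final evaluation.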
Keywords