Frontiers in Public Health (Oct 2022)
Machine learning approaches for prediction of early death among lung cancer patients with bone metastases using routine clinical characteristics: An analysis of 19,887 patients
Abstract
Purpose: Bone is one of the most common sites of metastasis for malignant tumors. Patients with bone metastases whose expected survival is shorter than 3 months (early death) are considered to have a contraindication to surgery. However, the information currently available in the literature limits our capacity to assess the risk of 3-month mortality. The objective of this study was therefore to develop an accurate machine-learning model that predicts 3-month mortality specifically among lung cancer patients with bone metastases from readily available clinical data.
Methods: This study enrolled 19,887 lung cancer patients with bone metastases between 2010 and 2018 from a large oncologic database in the United States. The entire cohort was randomly assigned in an 8:2 ratio to a training group (n = 15,881, 80%) and a validation group (n = 4,006, 20%). In the training group, prediction models were trained and optimized using six approaches: logistic regression, XGBoosting machine, random forest, neural network, gradient boosting machine, and decision tree. Thirteen metrics, including the Brier score, calibration slope, intercept-in-large, area under the curve (AUC), and sensitivity, were used to assess each model's prediction performance in the validation group. For each metric, the best-performing model was assigned six points and the worst-performing model one point. The model with the highest total score across the 13 metrics was considered optimal. Model explainability was assessed for the optimal model using local interpretable model-agnostic explanations (LIME). Predictor importance was assessed using H2O automatic machine learning. Risk stratification was also evaluated based on the optimal threshold.
Results: Among all enrolled patients, the 3-month mortality was 48.5%. Twelve variables (age, primary site, histology, race, sex, tumor (T) stage, node (N) stage, brain metastasis, liver metastasis, cancer-directed surgery, radiation, and chemotherapy) were significantly associated with 3-month mortality in multivariate analysis and were included in developing the prediction models. With the highest total score, the gradient boosting machine outperformed all other models (62 points), followed by the XGBoosting machine (59 points) and logistic regression (53 points). For these three models, the AUC was 0.820 (95% confidence interval [CI]: 0.807–0.833), 0.820 (95% CI: 0.807–0.833), and 0.815 (95% CI: 0.801–0.828), respectively; the calibration slope was 0.97, 0.95, and 0.96, respectively; and accuracy was 0.772 for all three. Explainability analysis was used to rank the predictors and visualize their contributions to an individual's mortality outcome. The four most important predictors in the population according to H2O automatic machine learning were chemotherapy, liver metastasis, radiation, and brain metastasis, in that order. Compared with patients in the low-risk group, patients in the high-risk group had more than three times the odds of dying within 3 months (P < 0.001).
Conclusions: This study developed several machine-learning models and identified the optimal one after thoroughly assessing and comparing the prediction performance of each. The optimal model can serve as a pragmatic risk prediction tool that identifies lung cancer patients with bone metastases who are at high risk of 3-month mortality, informs risk counseling, and aids clinical treatment decision-making. Patients in the high-risk group are better advised to receive radiotherapy alone, best supportive care, or minimally invasive procedures such as cementoplasty.
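The point-based model comparison described in the abstract (best model on a metric earns the most points, worst earns one, and points are summed across metrics) can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `rank_sum_scores`, the `LOSS` table, and the tie rule (tied models all receive the higher number of points) are assumptions made for illustration, as is awarding the intermediate points by rank order. The example uses only the AUC and calibration slope values reported in the abstract for three of the six models, so its totals are not expected to match the 62, 59, and 53 points obtained from all 13 metrics.

```python
from typing import Callable, Dict

# Map each metric's raw value to a "loss" where smaller means better.
# Only two of the study's 13 metrics are listed; both entries are assumptions
# about how "best" would be defined for that metric.
LOSS: Dict[str, Callable[[float], float]] = {
    "auc": lambda v: -v,                           # higher AUC is better
    "calibration_slope": lambda v: abs(v - 1.0),   # closer to 1 is better
}


def rank_sum_scores(
    results: Dict[str, Dict[str, float]],
    loss: Dict[str, Callable[[float], float]],
) -> Dict[str, int]:
    """Sum per-metric points for each model.

    With k models, the best model on a metric earns k points and the worst
    earns 1. Tie handling is an assumption: tied models all receive the
    higher number of points.
    """
    models = list(results)
    k = len(models)
    totals = {m: 0 for m in models}
    for metric, to_loss in loss.items():
        # Round to avoid spurious differences from floating-point noise.
        losses = {m: round(to_loss(results[m][metric]), 6) for m in models}
        for m in models:
            # Count the models that are strictly better on this metric.
            better = sum(1 for other in models if losses[other] < losses[m])
            totals[m] += k - better
    return totals


# Values taken from the abstract (validation group).
example = {
    "gbm": {"auc": 0.820, "calibration_slope": 0.97},
    "xgboost": {"auc": 0.820, "calibration_slope": 0.95},
    "logistic_regression": {"auc": 0.815, "calibration_slope": 0.96},
}

print(rank_sum_scores(example, LOSS))
# {'gbm': 6, 'xgboost': 4, 'logistic_regression': 3}
```

Expressing every metric as a loss keeps the ranking logic uniform: metrics where higher is better, lower is better, or a target value is ideal all reduce to "smaller loss wins".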
Keywords