Arthritis Research & Therapy (May 2024)
Early identification of macrophage activation syndrome secondary to systemic lupus erythematosus with machine learning
Abstract
Objective Macrophage activation syndrome (MAS) secondary to systemic lupus erythematosus (SLE) is a severe, life-threatening complication, and its early diagnosis is particularly challenging. In this study, we developed machine learning models and a diagnostic scorecard based on clinical characteristics to aid clinical decision-making.
Methods We retrospectively collected clinical data from 188 patients with either SLE or MAS secondary to SLE. Thirteen significant clinical predictor variables were selected using the Least Absolute Shrinkage and Selection Operator (LASSO) and used as inputs to five machine learning models. Model performance was evaluated using the area under the receiver operating characteristic curve (ROC-AUC), the F1 score, and the F2 score. To enhance clinical usability, we developed a diagnostic scorecard based on logistic regression (LR) analysis and chi-square binning, and established probability thresholds and risk strata for the card. Data from four other domestic hospitals were collected for external validation.
Results Among the machine learning models, the LR model performed best in internal validation, achieving a ROC-AUC of 0.998, an F1 score of 0.96, and an F2 score of 0.952. The scorecard placed the probability threshold at a score of 49, achieving a ROC-AUC of 0.994 and an F2 score of 0.936. Scores were categorized into five groups by diagnostic probability: extremely low (below 5%), low (5–25%), normal (25–75%), high (75–95%), and extremely high (above 95%). In external validation, the Support Vector Machine (SVM) model outperformed the other models with an AUC of 0.947, and the scorecard model achieved an AUC of 0.915. We also established an online assessment system for early identification of MAS secondary to SLE.
Conclusion Machine learning models can significantly improve the diagnostic accuracy of MAS secondary to SLE, and the diagnostic scorecard model can facilitate personalized probabilistic predictions of disease occurrence in clinical environments.
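As an illustration of two quantities the abstract reports, the following is a minimal Python sketch, not the authors' implementation. It shows how an F-beta score (F1 when beta = 1, F2 when beta = 2) is computed from confusion-matrix counts, and how a predicted probability could be mapped to the five diagnostic probability groups. The function names `f_beta` and `risk_group` are hypothetical, and the handling of values that fall exactly on a boundary (5%, 25%, 75%, 95%) is an assumption, since the abstract does not specify it.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true positives, false positives, and false negatives.

    beta = 1 gives the F1 score; beta = 2 gives the F2 score, which weights
    recall more heavily than precision.
    """
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def risk_group(probability: float) -> str:
    """Map a predicted probability (0 to 1) to one of the five groups.

    Assumption: each band includes its lower bound, except that exactly 95%
    is treated as "high" because the top band is described as "above 95%".
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < 0.05:
        return "extremely low"
    if probability < 0.25:
        return "low"
    if probability < 0.75:
        return "normal"
    if probability <= 0.95:
        return "high"
    return "extremely high"


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not data from the study).
    print(f_beta(tp=8, fp=2, fn=4, beta=1))
    print(f_beta(tp=8, fp=2, fn=4, beta=2))
    print(risk_group(0.80))
```

Because F2 emphasizes recall, it penalizes missed cases more than false alarms, which is a common reason to report it alongside F1 when a missed diagnosis is the more costly error.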
Keywords