JMIR Formative Research (May 2024)
Classifying Self-Reported Rheumatoid Arthritis Flares Using Daily Patient-Generated Data From a Smartphone App: Exploratory Analysis Applying Machine Learning Approaches
Abstract
Background: The ability to predict rheumatoid arthritis (RA) flares between clinic visits based on real-time, longitudinal patient-generated data could potentially allow for timely interventions to avoid disease worsening.
Objective: This exploratory study aims to investigate the feasibility of using machine learning methods to classify self-reported RA flares based on a small data set of daily symptom data collected on a smartphone app.
Methods: Daily symptoms and weekly flares reported on the Remote Monitoring of Rheumatoid Arthritis (REMORA) smartphone app from 20 patients with RA over 3 months were used. Predictors were several summary features of the daily symptom scores (eg, pain and fatigue) collected in the week leading up to the flare question. We fitted 3 binary classifiers: logistic regression with and without elastic net regularization, a random forest, and naive Bayes. Performance was evaluated according to the area under the curve (AUC) of the receiver operating characteristic curve. For the best-performing model, we considered sensitivity and specificity for different thresholds in order to illustrate different ways in which the predictive model could behave in a clinical setting.
Results: The data comprised an average of 60.6 daily reports and 10.5 weekly reports per participant. Participants reported a median of 2 (IQR 0.75-4.25) flares each over a median follow-up time of 81 (IQR 79-82) days. AUCs were broadly similar between models, but logistic regression with elastic net regularization had the highest AUC of 0.82. At a cutoff requiring specificity to be 0.80, the corresponding sensitivity to detect flares was 0.60 for this model. The positive predictive value (PPV) in this population was 53%, and the negative predictive value (NPV) was 85%. Given the prevalence of flares, the best PPV achieved meant only around 2 of every 3 positive predictions were correct (PPV 0.65).
By prioritizing a higher NPV, the model correctly predicted over 9 in every 10 non-flare weeks, but the accuracy of predicted flares fell to only 1 in 2 being correct (NPV and PPV of 0.92 and 0.51, respectively).
Conclusions: Predicting self-reported flares based on daily symptom scores in the preceding week using machine learning methods was feasible. The observed predictive accuracy might improve as we obtain more data, and these exploratory results need to be validated in an external cohort. In the future, analysis of frequently collected patient-generated data may allow us to predict flares before they unfold, opening opportunities for just-in-time adaptive interventions. Depending on the nature and implication of an intervention, different cutoff values for an intervention decision need to be considered, as well as the level of predictive certainty required.
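The trade-off between cutoff, sensitivity, specificity, PPV, and NPV described in the Results can be illustrated with a minimal Python sketch. It is not the study's analysis code: the function names, the toy scores and labels, and the prevalence of 0.25 are all hypothetical and chosen only to show the arithmetic. The first part picks the lowest cutoff that meets a target specificity (which keeps sensitivity as high as possible under that constraint); the second part shows how, for a fixed sensitivity and specificity, PPV and NPV follow from prevalence via Bayes' rule.

```python
from typing import Dict, Sequence


def metrics_at_threshold(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Dict[str, float]:
    """Treat score >= threshold as a predicted flare and summarize the result."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def lowest_threshold_with_specificity(
    y_true: Sequence[int], scores: Sequence[float], target: float
) -> float:
    """Return the lowest observed score whose specificity meets the target.

    Specificity can only rise as the cutoff rises, so the first cutoff that
    satisfies the target is also the one with the highest sensitivity.
    """
    for t in sorted(set(scores)):
        if metrics_at_threshold(y_true, scores, t)["specificity"] >= target:
            return t
    return float("inf")  # no observed score qualifies: predict no flares


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Dict[str, float]:
    """PPV and NPV implied by sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    return {"ppv": tp / (tp + fp), "npv": tn / (tn + fn)}


# Hypothetical toy data: 10 participant-weeks, 3 of them flare weeks (label 1).
toy_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.55, 0.70, 0.90]
toy_labels = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1]

cutoff = lowest_threshold_with_specificity(toy_labels, toy_scores, target=0.80)
toy_metrics = metrics_at_threshold(toy_labels, toy_scores, cutoff)

# Sensitivity 0.60 and specificity 0.80 are taken from the abstract;
# the prevalence of 0.25 is an assumed value for illustration only.
implied = predictive_values(sensitivity=0.60, specificity=0.80, prevalence=0.25)

print(cutoff, toy_metrics)
print(implied)
```

Under the assumed prevalence, the implied PPV is about 0.50 and the implied NPV about 0.86, which shows why a model with reasonable sensitivity and specificity can still yield a modest PPV when flare weeks are relatively uncommon. Raising the cutoff trades sensitivity for specificity, so the appropriate operating point depends on the cost of the intervention that a positive prediction would trigger.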