Journal of Medical Internet Research (Mar 2023)

Machine Learning Approaches for Predicting Psoriatic Arthritis Risk Using Electronic Medical Records: Population-Based Study

  • Leon Tsung-Ju Lee,
  • Hsuan-Chia Yang,
  • Phung Anh Nguyen,
  • Muhammad Solihuddin Muhtar,
  • Yu-Chuan Jack Li

DOI
https://doi.org/10.2196/39972
Journal volume & issue
Vol. 25
p. e39972

Abstract

Background: Psoriasis (PsO) is a chronic, systemic, immune-mediated disease with multiorgan involvement. Psoriatic arthritis (PsA) is an inflammatory arthritis that is present in 6%-42% of patients with PsO. Approximately 15% of patients with PsO have undiagnosed PsA. Identifying patients at risk of PsA is crucial for providing them with early examination and treatment that can prevent irreversible disease progression and functional loss.

Objective: The aim of this study was to develop and validate a prediction model for PsA based on chronological, large-scale, multidimensional electronic medical records using a machine learning algorithm.

Methods: This case-control study used Taiwan’s National Health Insurance Research Database from January 1, 1999, to December 31, 2013. The original data set was split into training and holdout data sets in an 80:20 ratio. A convolutional neural network was used to develop a prediction model. This model used 2.5 years of diagnostic and medical records (inpatient and outpatient) with temporal-sequential information to predict the risk of PsA for a given patient within the next 6 months. The model was developed and cross-validated using the training data and was tested using the holdout data. An occlusion sensitivity analysis was performed to identify the important features of the model.

Results: The analysis included a total of 443 patients with PsA who had an earlier diagnosis of PsO and 1772 patients with PsO without PsA as the control group. The 6-month PsA risk prediction model, which uses sequential diagnostic and drug prescription information as a temporal phenomic map, yielded an area under the receiver operating characteristic curve of 0.70 (95% CI 0.559-0.833), a mean sensitivity of 0.80 (SD 0.11), a mean specificity of 0.60 (SD 0.04), and a mean negative predictive value of 0.93 (SD 0.04).

Conclusions: The findings of this study suggest that the risk prediction model can identify patients with PsO at a high risk of PsA. This model may help health care professionals to prioritize treatment for targeted high-risk populations and prevent irreversible disease progression and functional loss.
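
As a rough consistency check on the reported metrics, the negative predictive value (NPV) implied by a given sensitivity, specificity, and case proportion can be computed from the standard definition. The sketch below is illustrative only and is not the authors' code: the function name is hypothetical, and it assumes the case proportion in the evaluated data equals the overall ratio of 443 cases to 1772 controls. Because the abstract reports means across validation runs, the result is expected to be close to, but not identical to, the reported 0.93.

```python
def negative_predictive_value(sensitivity, specificity, prevalence):
    """NPV = TN / (TN + FN), expressed as population fractions.

    Hypothetical helper for illustration; not taken from the study.
    """
    true_neg = specificity * (1.0 - prevalence)    # controls correctly ruled out
    false_neg = (1.0 - sensitivity) * prevalence   # cases the model misses
    return true_neg / (true_neg + false_neg)


# Counts and mean metrics as reported in the abstract.
cases, controls = 443, 1772
prevalence = cases / (cases + controls)            # 443 / 2215

npv = negative_predictive_value(0.80, 0.60, prevalence)
print(round(prevalence, 3), round(npv, 3))         # prints: 0.2 0.923
```

Under these assumptions the implied NPV is about 0.92, in line with the reported mean. The high NPV depends on the 1:4 case-control ratio used here, so it would differ in a population with a different proportion of PsA among patients with PsO.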