BMC Neurology (Aug 2024)

Machine learning characterization of a rare neurologic disease via electronic health records: a proof-of-principle study on stiff person syndrome

  • Soo Hwan Park,
  • Seo Ho Song,
  • Frederick Burton,
  • Cybèle Arsan,
  • Barbara Jobst,
  • Mary Feldman

DOI
https://doi.org/10.1186/s12883-024-03760-7
Journal volume & issue
Vol. 24, no. 1, pp. 1–7

Abstract

Background: Rare neurologic diseases (RNDs) are frequently diagnosed late, yet they and their comorbidities remain difficult to study because their rarity leaves most analyses statistically underpowered. Stiff person syndrome (SPS), which affects one to two people per million annually, is an RND characterized by painful muscle spasms and rigidity. Leveraging underutilized electronic health records (EHR), this study showcased a machine-learning-based framework for identifying the clinical features that best characterize the diagnosis of SPS.

Methods: A machine-learning-based feature selection approach was applied to 319 items from the past medical histories of 48 individuals (23 with a diagnosis of SPS and 25 controls) with elevated serum autoantibodies against glutamic-acid-decarboxylase-65 (anti-GAD65) in Dartmouth Health’s EHR to determine the features with the highest discriminatory power. Each iteration of the algorithm fitted a Support Vector Machine (SVM) model, generated an importance score (a SHapley Additive exPlanation, or SHAP, value) for each feature, and removed the least salient feature. Evaluation metrics were calculated through repeated stratified cross-validation.

Results: Depression, hypothyroidism, gastroesophageal reflux disease (GERD), and joint pain were the most characteristic features of SPS. Using these features, the SVM model attained a precision of 0.817 (95% CI 0.795–0.840), sensitivity of 0.766 (95% CI 0.743–0.790), F-score of 0.761 (95% CI 0.744–0.778), AUC of 0.808 (95% CI 0.791–0.825), and accuracy of 0.775 (95% CI 0.759–0.790).

Conclusions: This framework discerned features that, with further research, may help fully characterize the pathologic mechanism of SPS: depression, hypothyroidism, and GERD may represent comorbidities through common inflammatory, genetic, and dysautonomic links, respectively. This methodology could address diagnostic challenges in neurology by uncovering latent associations and generating hypotheses for RNDs.
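The elimination loop described in the Methods can be illustrated with a minimal, standard-library Python sketch. It is not the authors' code and rests on several assumptions the abstract does not confirm: a linear SVM trained by plain subgradient descent, and the closed-form SHAP value of a linear model with features treated as independent, w_j * (x_j - mean_j), used in place of a SHAP library. The function names, hyperparameters, and toy data are hypothetical, and the repeated stratified cross-validation step is omitted for brevity.

```python
from statistics import mean


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Full-batch subgradient descent on the L2-regularised hinge loss.

    X is a list of feature rows, y holds labels in {-1, +1}.
    Returns (weights, bias). Kernel choice is an assumption.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # this row violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def mean_abs_linear_shap(X, w):
    """Mean |SHAP| per feature for a linear model.

    Assumes independent features, so the SHAP value of feature j
    for a row x is w[j] * (x[j] - column_mean[j]).
    """
    d = len(w)
    col_means = [mean(row[j] for row in X) for j in range(d)]
    return [
        mean(abs(w[j] * (row[j] - col_means[j])) for row in X)
        for j in range(d)
    ]


def backward_eliminate(X, y, names, keep=1):
    """Refit, score, and drop the least salient feature until `keep` remain."""
    active = list(range(len(names)))
    removed = []
    while len(active) > keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        scores = mean_abs_linear_shap(X_sub, w)
        worst = min(range(len(active)), key=scores.__getitem__)
        removed.append(names[active.pop(worst)])
    return [names[j] for j in active], removed


# Hypothetical toy data: binary history items for 8 records.
# "signal" tracks the label, "constant" never varies, "noise" is unrelated.
names = ["signal", "constant", "noise"]
X = [
    [1, 1, 1], [1, 1, 0], [1, 1, 1], [1, 1, 0],
    [0, 1, 1], [0, 1, 0], [0, 1, 1], [0, 1, 0],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

kept, removed = backward_eliminate(X, y, names, keep=1)
print("kept:", kept)
print("removal order:", removed)
```

In a fuller pipeline, each candidate feature subset would be scored inside repeated stratified cross-validation so that the reported metrics are not computed on the same records used for selection.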

Keywords