Applied Computer Science (Sep 2024)
A QUALITATIVE AND QUANTITATIVE APPROACH USING MACHINE LEARNING AND NON-MOTOR SYMPTOMS FOR PARKINSON’S DISEASE CLASSIFICATION. A HIERARCHICAL STUDY
Abstract
Parkinson's Disease (PD) is a neurodegenerative disorder that impacts movement, speech, dexterity, and cognition. PD is diagnosed primarily through clinical assessment, but the variability of its symptoms often leads to misdiagnosis. This study examines Machine Learning (ML) algorithms for distinguishing Healthy People (HP) from People with Parkinson's Disease (PPD). Data were drawn from the Parkinson's Progression Markers Initiative (PPMI) for 106 HP and 106 PPD participants who underwent the Parkinson's Disease Sleep Test (PDST), the Hopkins Verbal Learning Test (HVLT), and the Clock Drawing Test (CDT). A custom HYBRID dataset was also created by integrating these three datasets. The following ML Classification Algorithms (CA) were studied: Random Forest (RF), Naïve Bayes (NB), Support Vector Machine (SVM), and Logistic Regression (LR). Multiple feature sets were generated using various Feature Selection (FS) algorithms and ensemble mechanisms: the first quartile (Q1: the 25 % most important features), the second quartile (Q2: the 50 % most important features), the third quartile (Q3: the 75 % most important features), and the fourth quartile (Q4: all 100 % of the features). All the ML CA achieved over 73±8.4 % accuracy on the individual datasets, while the proposed HYBRID dataset achieved an accuracy of 98±0.6 %. The study identified, in a hierarchical approach, the optimal number of non-motor features, the best dataset, and the best FS and CA for early PD diagnosis, and showed that PD may be diagnosed with high accuracy by analyzing non-motor PD parameters using ML algorithms. This suggests that extended data collection could serve as a digital biomarker for PD diagnosis in the future.
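The quartile feature sets described in the abstract can be illustrated with a minimal Python sketch. It assumes each feature has already received an importance score from some FS algorithm; the function name `quartile_feature_sets`, the round-up rule, and the feature names and scores are hypothetical and are not taken from the study.

```python
import math


def quartile_feature_sets(importances):
    """Build nested top-25/50/75/100 % feature subsets from importance scores.

    `importances` maps a feature name to a score (higher means more important).
    """
    # Rank from most to least important; ties are broken by name for determinism.
    ranked = sorted(importances, key=lambda name: (-importances[name], name))
    n = len(ranked)
    subsets = {}
    for label, fraction in (("Q1", 0.25), ("Q2", 0.50), ("Q3", 0.75), ("Q4", 1.00)):
        # Assumption: round up, so a small feature pool still yields one feature.
        k = max(1, math.ceil(fraction * n)) if n else 0
        subsets[label] = ranked[:k]
    return subsets


# Hypothetical feature names and scores, for illustration only.
example_scores = {
    "pdst_item_a": 0.30,
    "hvlt_recall_a": 0.22,
    "cdt_score_a": 0.15,
    "pdst_item_b": 0.12,
    "hvlt_recall_b": 0.09,
    "cdt_score_b": 0.06,
    "pdst_item_c": 0.04,
    "hvlt_recall_c": 0.02,
}

feature_sets = quartile_feature_sets(example_scores)
for label, names in feature_sets.items():
    print(label, len(names), names)
```

Each subset is nested in the next (Q1 within Q2 within Q3 within Q4), so a classifier trained on each subset in turn shows how accuracy changes as less important features are added.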
Keywords