Digital Health (Mar 2023)
Validation of the influence of biosignals on performance of machine learning algorithms for sleep stage classification
Abstract
Background: Sleep stage identification is critical in multiple fields (e.g. medicine and psychology) for diagnosing sleep-related disorders. Previous studies have reported that the performance of machine learning algorithms in sleep stage classification varies with the biosignals used and the feature-extraction process.
Methods: To compare as many conditions as possible, 414 experimental conditions were tested, combining different biosignals, signal lengths, and window lengths. Five polysomnography biosignals (electrocardiogram (ECG), electroencephalogram (EEG), electromyogram (EMG), left electrooculogram, and right electrooculogram) were used to identify the optimal signal combination for classification. In addition, three signal-length conditions and six window-length conditions were applied. Each condition was evaluated by the classification performance of XGBoost classifiers trained with 10-fold cross-validation. Feature importance was also examined to check the experimental results from a model-explanation perspective.
Results: The combination of EEG + EMG + ECG with a 40 s window and a 120 s signal length gave the best classification performance (precision: 0.853, recall: 0.855, F1-score: 0.853, accuracy: 0.853). Both the comparison across conditions and the feature-importance results indicated that EEG signals were relatively more important for classification in the present study.
Conclusion: We determined the optimal biosignal and window conditions for the feature-extraction process in machine learning-based sleep stage classification. These experimental results can guide researchers in designing related studies. To generalize the results, future studies should apply more diverse methodologies and conditions.
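To make the signal-length and window-length conditions concrete, the following minimal sketch splits one 120 s signal segment into 40 s windows and computes a few summary statistics per window. It is an illustration under stated assumptions, not the study's pipeline: the sampling rate, the non-overlapping windows, the chosen statistics, and the function names (`split_windows`, `window_features`, `extract_features`) are all hypothetical, since the abstract does not specify them.

```python
import math
from statistics import mean, pstdev


def split_windows(signal, fs, window_s):
    """Split a signal into consecutive, non-overlapping windows.

    Non-overlapping windows are an assumption; the abstract does not say
    whether the windows in the study overlap.
    """
    step = int(fs * window_s)  # samples per window
    return [signal[i:i + step] for i in range(0, len(signal) - step + 1, step)]


def window_features(window):
    """Hypothetical per-window features: mean, standard deviation, min, max."""
    return [mean(window), pstdev(window), min(window), max(window)]


def extract_features(signal, fs, window_s):
    """Concatenate the per-window features into one flat feature vector."""
    features = []
    for window in split_windows(signal, fs, window_s):
        features.extend(window_features(window))
    return features


# Synthetic stand-in for one biosignal: a 1 Hz sine wave lasting 120 s.
fs = 100            # assumed sampling rate in Hz (not given in the abstract)
signal_length_s = 120
window_length_s = 40
signal = [math.sin(2 * math.pi * t / fs) for t in range(signal_length_s * fs)]

windows = split_windows(signal, fs, window_length_s)
features = extract_features(signal, fs, window_length_s)
print(len(windows), len(features))
```

Under these assumptions, a 120 s segment with a 40 s window yields three windows, and feature vectors from several biosignals (for example EEG, EMG, and ECG) could be concatenated before being passed to a classifier.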