Frontiers in Neurology (Apr 2023)

Deep learning-based algorithm accurately classifies sleep stages in preadolescent children with sleep-disordered breathing symptoms and age-matched controls

  • Pranavan Somaskandhan,
  • Timo Leppänen,
  • Philip I. Terrill,
  • Sigridur Sigurdardottir,
  • Erna Sif Arnardottir,
  • Kristín A. Ólafsdóttir,
  • Marta Serwatko,
  • Sigurveig Þ. Sigurðardóttir,
  • Michael Clausen,
  • Juha Töyräs,
  • Henri Korkalainen

DOI
https://doi.org/10.3389/fneur.2023.1162998
Journal volume & issue
Vol. 14

Abstract

Introduction: Visual sleep scoring has several shortcomings, including inter-scorer inconsistency, which may adversely affect diagnostic decision-making. Although automatic sleep staging in adults has been extensively studied, it is uncertain whether such sophisticated algorithms generalize well to different pediatric age groups, given their distinctive EEG characteristics. The preadolescent age group (10–13-year-olds) is relatively understudied, and thus we aimed to develop an automatic deep learning-based sleep stage classifier specifically targeting this cohort.

Methods: A dataset (n = 115) containing polysomnographic recordings of Icelandic preadolescent children with sleep-disordered breathing (SDB) symptoms and age- and sex-matched controls was utilized. We developed a combined convolutional and long short-term memory neural network architecture relying on electroencephalography (F4-M1), electrooculography (E1-M2), and chin electromyography signals. Performance relative to human scoring was further evaluated by analyzing intra- and inter-rater agreements in a subset (n = 10) of data with repeat scoring from two manual scorers.

Results: The deep learning-based model achieved an overall cross-validated accuracy of 84.1% (Cohen’s kappa κ = 0.78). There was no meaningful performance difference between the SDB-symptomatic (n = 53) and control (n = 52) subgroups [83.9% (κ = 0.78) vs. 84.2% (κ = 0.78)]. The inter-rater reliability between manual scorers was 84.6% (κ = 0.78), and the automatic method reached similar agreement with each scorer: 83.4% (κ = 0.76) and 82.7% (κ = 0.75).

Conclusion: The developed algorithm achieved high classification accuracy and substantial agreement with two manual scorers; the performance metrics compared favorably with typical inter-rater reliability between manual scorers and with performance reported in previous studies. These results suggest that our algorithm may facilitate less labor-intensive and reliable automatic sleep scoring in preadolescent children.
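The abstract reports two agreement measures: epoch-wise accuracy and Cohen's kappa (κ), which corrects the observed agreement for the agreement expected by chance. The following is a minimal sketch of how these two quantities are commonly computed from a pair of hypnograms. It is not the authors' evaluation code; the function names and the ten-epoch toy hypnograms (`manual`, `auto`) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def accuracy(scoring_a, scoring_b):
    """Fraction of epochs on which two scorings assign the same stage."""
    if len(scoring_a) != len(scoring_b) or not scoring_a:
        raise ValueError("scorings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(scoring_a, scoring_b))
    return matches / len(scoring_a)


def cohens_kappa(scoring_a, scoring_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (accuracy) and p_e is the agreement
    expected by chance, derived from each scorer's stage frequencies.
    """
    n = len(scoring_a)
    p_o = accuracy(scoring_a, scoring_b)
    counts_a, counts_b = Counter(scoring_a), Counter(scoring_b)
    stages = counts_a.keys() | counts_b.keys()
    p_e = sum(counts_a[s] * counts_b[s] for s in stages) / (n * n)
    if p_e == 1.0:
        # Both scorings use a single, identical stage throughout.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy hypnograms, one label per 30-second epoch (illustration only).
manual = ["W", "N1", "N2", "N2", "N3", "N3", "R", "R", "N2", "W"]
auto = ["W", "N2", "N2", "N2", "N3", "N2", "R", "R", "N2", "W"]

print(f"accuracy = {accuracy(manual, auto):.3f}")
print(f"kappa    = {cohens_kappa(manual, auto):.3f}")
```

In this toy example the two scorings agree on 8 of 10 epochs, yet κ is lower than the raw accuracy because part of that agreement would be expected by chance. The same relationship appears in the abstract, where an accuracy of 84.1% corresponds to κ = 0.78.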

Keywords