Frontiers in Neurology (Feb 2023)

Benchmarking performance of an automatic polysomnography scoring system in a population with suspected sleep disorders

  • Bryan Peide Choo,
  • Yingjuan Mok,
  • Hong Choon Oh,
  • Amiya Patanaik,
  • Kishan Kishan,
  • Animesh Awasthi,
  • Siddharth Biju,
  • Soumya Bhattacharjee,
  • Yvonne Poh,
  • Hang Siang Wong

DOI
https://doi.org/10.3389/fneur.2023.1123935
Journal volume & issue
Vol. 14

Abstract

Aim

The current gold standard for measuring sleep disorders is polysomnography (PSG), which is manually scored by a sleep technologist. Scoring a PSG is time-consuming and tedious, with substantial inter-rater variability. A deep-learning-based sleep analysis software module can perform autoscoring of PSG. The primary objective of this study was to validate the accuracy and reliability of the autoscoring software. The secondary objective was to measure workflow improvements, in terms of time and cost, via a time-motion study.

Methodology

The performance of the automatic PSG scoring software was benchmarked against that of two independent sleep technologists on PSG data collected from patients with suspected sleep disorders. Technologists at the hospital clinic and at a third-party scoring company scored the PSG records independently, and their scores were then compared with each other and with the automatic scoring system. An observational study was also performed in which the time taken by the hospital technologists to manually score PSGs was tracked alongside the time taken by the automatic scoring software, to assess potential time savings.

Results

Pearson's correlation between the manually scored apnea–hypopnea index (AHI) and the automatically scored AHI was 0.962, demonstrating near-perfect agreement. The autoscoring system demonstrated similar results in sleep staging: the agreement between automatic staging and manual scoring was higher, in terms of both accuracy and Cohen's kappa, than the agreement between the two experts. The autoscoring system took an average of 42.7 s to score each record, compared with 4,243 s for manual scoring. Even after a manual review of the auto scores, an average time saving of 38.6 min per PSG was observed, amounting to 0.25 full-time equivalent (FTE) saved per year.

Conclusion

The findings indicate a potential reduction in the burden of manual PSG scoring on sleep technologists and may be of operational significance for sleep laboratories in the healthcare setting.
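The two agreement metrics reported above can be sketched in a few lines. The snippet below is a minimal illustration, not the study's analysis code; the AHI values and 30-s epoch stage labels are invented for demonstration, and both metrics are implemented directly with NumPy.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two continuous score series,
    e.g. manually vs. automatically scored AHI per patient."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

def cohens_kappa(a, b):
    """Cohen's kappa for two raters' categorical labels,
    e.g. per-epoch sleep stages from manual vs. automatic scoring."""
    a, b = np.asarray(a), np.asarray(b)
    labels = np.union1d(a, b)
    po = float(np.mean(a == b))                      # observed agreement
    pe = sum(float(np.mean(a == l)) * float(np.mean(b == l))
             for l in labels)                        # chance agreement
    return (po - pe) / (1.0 - pe)

# Hypothetical AHI values (events/hour) for five patients -- not study data.
manual_ahi = [5.1, 12.3, 30.2, 45.0, 8.7]
auto_ahi   = [5.4, 11.8, 29.5, 46.2, 9.1]
r = pearson_r(manual_ahi, auto_ahi)

# Hypothetical stage labels for eight 30-s epochs (W, N1, N2, N3, R).
manual_stages = ["W", "N1", "N2", "N2", "N3", "R", "N2", "W"]
auto_stages   = ["W", "N1", "N2", "N2", "N3", "R", "N1", "W"]
kappa = cohens_kappa(manual_stages, auto_stages)
```

An r near 1 indicates the two AHI series rise and fall together, while kappa discounts the agreement two scorers would reach by chance from their marginal stage frequencies, which is why it is preferred over raw accuracy for epoch-by-epoch staging comparisons.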

Keywords