Nature and Science of Sleep (Jun 2022)

End-to-End Sleep Staging Using Nocturnal Sounds from Microphone Chips for Mobile Devices

  • Hong J,
  • Tran HH,
  • Jung J,
  • Jang H,
  • Lee D,
  • Yoon IY,
  • Hong JK,
  • Kim JW

Journal volume & issue
Vol. 14
pp. 1187–1201

Abstract


Joonki Hong,1,2 Hai Hong Tran,1 Jinhwan Jung,1 Hyeryung Jang,3 Dongheon Lee,1 In-Young Yoon,4,5 Jung Kyung Hong,4,5,* Jeong-Whun Kim5,6,*

1Asleep Inc., Seoul, Korea; 2Korea Advanced Institute of Science and Technology, Daejeon, Korea; 3Dongguk University, Seoul, Korea; 4Department of Psychiatry, Seoul National University Bundang Hospital, Seongnam, Korea; 5Seoul National University College of Medicine, Seoul, Korea; 6Department of Otorhinolaryngology, Seoul National University Bundang Hospital, Seongnam, Korea

*These authors contributed equally to this work

Correspondence: Jeong-Whun Kim, Department of Otorhinolaryngology, Seoul National University College of Medicine, Seoul National University Bundang Hospital, 82, Gumi-ro 173beon-gil, Bundang-gu, Seongnam, Gyeonggi-do, 463-707, Korea, Email [email protected]; Jung Kyung Hong, Department of Psychiatry, Seoul National University College of Medicine, Seoul National University Bundang Hospital, 82, Gumi-ro 173beon-gil, Bundang-gu, Seongnam, Gyeonggi-do, 463-707, Korea, Email [email protected]

Purpose: Nocturnal sounds contain a wealth of information and can be obtained easily in a non-contact manner. Sleep staging using nocturnal sounds recorded by common mobile devices may enable daily at-home sleep tracking. The objective of this study was to introduce an end-to-end (sound-to-sleep-stages) deep learning model for sound-based sleep staging designed to work with audio from microphone chips, which are essential components of mobile devices such as modern smartphones.

Patients and Methods: Two different audio datasets were used: audio routinely recorded by a solitary microphone chip during polysomnography (PSG dataset, N=1154) and audio recorded by a smartphone (smartphone dataset, N=327). The audio was converted into Mel spectrograms to detect latent temporal-frequency patterns of breathing and body movement against ambient noise. The proposed neural network model first extracts features from each 30-second epoch and then analyzes inter-epoch relationships among the extracted features to classify the epochs into sleep stages.

Results: Our model achieved 70% epoch-by-epoch agreement for 4-class (wake, light, deep, REM) sleep stage classification and robust performance across various signal-to-noise conditions. Model performance was not considerably affected by sleep apnea or periodic limb movement. External validation with the smartphone dataset showed 68% epoch-by-epoch agreement.

Conclusion: The proposed end-to-end deep learning model shows the potential of low-quality sounds recorded by microphone chips to be used for sleep staging. Future studies using nocturnal sounds recorded by mobile devices in home environments may further confirm the utility of mobile device recordings for at-home sleep tracking.

Keywords: respiratory sounds, sleep stages, deep learning, smartphone, polysomnography
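For readers who want a concrete picture of the pipeline the abstract describes, the sketch below shows, in Python, how a 30-second audio epoch might be converted to a log-Mel spectrogram and passed through a two-stage network: a per-epoch feature extractor followed by an inter-epoch sequence model and a 4-class head. This is a minimal illustration under stated assumptions, not the authors' implementation; the sample rate, Mel-bin count, layer sizes, and the choice of a Transformer encoder for the inter-epoch stage are all assumptions, and `SoundSleepStager` is a hypothetical name.

```python
import numpy as np
import librosa
import torch
import torch.nn as nn

SR = 16000       # assumed sample rate; not specified in the abstract
EPOCH_SEC = 30   # standard PSG scoring epoch length (per the abstract)
N_MELS = 64      # assumed number of Mel bins

def epoch_to_mel(audio_epoch: np.ndarray) -> np.ndarray:
    """Convert one 30-second audio epoch to a log-scaled Mel spectrogram."""
    mel = librosa.feature.melspectrogram(
        y=audio_epoch, sr=SR, n_fft=1024, hop_length=512, n_mels=N_MELS)
    return librosa.power_to_db(mel, ref=np.max)

class SoundSleepStager(nn.Module):
    """Two-stage model: a CNN extracts features within each epoch, then a
    Transformer encoder models inter-epoch relationships, and a linear head
    emits 4-class (wake/light/deep/REM) logits per epoch."""
    def __init__(self, d_model: int = 128, n_classes: int = 4):
        super().__init__()
        # Intra-epoch feature extractor over the Mel spectrogram
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, d_model))
        # Inter-epoch sequence model (Transformer chosen as an assumption)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.inter_epoch = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, epochs, mel_bins, frames)
        b, t, m, f = x.shape
        feats = self.encoder(x.reshape(b * t, 1, m, f)).reshape(b, t, -1)
        ctx = self.inter_epoch(feats)   # contextualize epochs across the night
        return self.head(ctx)           # (batch, epochs, n_classes) logits

# Usage sketch: 20 consecutive epochs, ~938 spectrogram frames per 30 s
# at SR=16000 with hop_length=512.
dummy = torch.randn(1, 20, N_MELS, 938)
logits = SoundSleepStager()(dummy)      # shape (1, 20, 4)
```

The two-stage structure mirrors the abstract's description (per-epoch feature extraction, then inter-epoch analysis); everything else, including the specific layer counts, is illustrative.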
