Healthcare (Sep 2022)

Validation of a Natural Language Processing Algorithm for the Extraction of the Sleep Parameters from the Polysomnography Reports

  • Mahbubur Rahman,
  • Sara Nowakowski,
  • Ritwick Agrawal,
  • Aanand Naik,
  • Amir Sharafkhaneh,
  • Javad Razjouyan

DOI
https://doi.org/10.3390/healthcare10101837
Journal volume & issue
Vol. 10, no. 10
p. 1837

Abstract

Background: There is a need to better understand the association between sleep and chronic diseases. In this study, we developed a natural language processing (NLP) algorithm to mine polysomnography (PSG) free-text notes from electronic medical records (EMR) and evaluated its performance. Methods: Using the Veterans Health Administration EMR, we identified 46,093 PSG studies with CPT code 95810 performed from 1 October 2000 to 30 September 2019. We randomly selected 200 notes to evaluate the accuracy of the NLP algorithm in extracting sleep parameters, including total sleep time (TST), sleep efficiency (SE), sleep onset latency (SOL), wake after sleep onset (WASO), and apnea-hypopnea index (AHI), against visual inspection by raters masked to the NLP output. Results: In both the training phase and the test phase, the NLP performance was >0.90 for precision, recall, and F-1 score for TST, SOL, SE, WASO, and AHI. Conclusions: This study showed that NLP is an accurate technique to extract sleep parameters from PSG reports in the EMR. Thus, NLP can serve as an effective tool in large health care systems to evaluate and improve patient care.
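To make the evaluation described in the abstract concrete, the following is a minimal sketch of rule-based extraction of the five sleep parameters and of scoring against manual review. It is not the authors' algorithm, which the abstract does not describe. The regular expressions, the sample report wording, the function names `extract` and `score`, and the convention that a wrong value counts as both a false positive and a false negative are all assumptions made for illustration.

```python
import re

# Hypothetical patterns: real PSG reports vary widely in wording and units,
# so these are illustrative only and not taken from the study.
NUMBER = r"\D{0,20}?(\d+(?:\.\d+)?)"
PATTERNS = {
    "TST": re.compile(r"total sleep time" + NUMBER, re.I),
    "SE": re.compile(r"sleep efficiency" + NUMBER, re.I),
    "SOL": re.compile(r"sleep (?:onset )?latency" + NUMBER, re.I),
    "WASO": re.compile(r"(?:wake after sleep onset|\bWASO\b)" + NUMBER, re.I),
    "AHI": re.compile(r"(?:apnea[- ]hypopnea index|\bAHI\b)" + NUMBER, re.I),
}


def extract(report):
    """Return the first numeric value found for each sleep parameter."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(report)
        if match:
            found[name] = float(match.group(1))
    return found


def score(predicted, gold, tol=1e-6):
    """Precision, recall, and F1 for one parameter across notes.

    `predicted` and `gold` are equal-length lists of floats or None
    (None means no value extracted / no value found by the rater).
    Assumed convention: a wrong value is both a false positive and
    a false negative.
    """
    tp = fp = fn = 0
    for pred, ref in zip(predicted, gold):
        if pred is not None and ref is not None and abs(pred - ref) <= tol:
            tp += 1
        else:
            if pred is not None:
                fp += 1
            if ref is not None:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented sample text, not drawn from any real report.
sample = ("Total sleep time (TST): 380.5 minutes. Sleep efficiency was 85%. "
          "Sleep onset latency of 12 min. Wake after sleep onset (WASO) was "
          "45 min. The apnea-hypopnea index (AHI) was 17.3 events/hour.")
values = extract(sample)
```

In use, `extract` would be run over each sampled note and, for each parameter, the list of extracted values would be passed to `score` together with the raters' values. Scoring each parameter separately mirrors how the abstract reports precision, recall, and F-1 score per parameter.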

Keywords