Validation of a Natural Language Processing Algorithm for the Extraction of the Sleep Parameters from the Polysomnography Reports

Mahbubur Rahman; Sara Nowakowski; Ritwick Agrawal; Aanand Naik; Amir Sharafkhaneh; Javad Razjouyan

doi:10.3390/healthcare10101837

Healthcare (Sep 2022)

Affiliations

Mahbubur Rahman: Houston Veterans Affairs Health Services Research and Development Service, Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veteran Affairs Medical Center, Houston, TX 77030, USA
Sara Nowakowski: Houston Veterans Affairs Health Services Research and Development Service, Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veteran Affairs Medical Center, Houston, TX 77030, USA
Ritwick Agrawal: Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
Aanand Naik: Houston Veterans Affairs Health Services Research and Development Service, Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veteran Affairs Medical Center, Houston, TX 77030, USA
Amir Sharafkhaneh: Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
Javad Razjouyan: Houston Veterans Affairs Health Services Research and Development Service, Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veteran Affairs Medical Center, Houston, TX 77030, USA

DOI: https://doi.org/10.3390/healthcare10101837
Journal volume & issue: Vol. 10, no. 10
p. 1837

Abstract

Background: The association between sleep and chronic diseases needs to be better understood. In this study, we developed a natural language processing (NLP) algorithm to mine polysomnography (PSG) free-text notes from electronic medical records (EMR) and evaluated its performance. Methods: Using the Veterans Health Administration EMR, we identified 46,093 PSG studies with CPT code 95810 from 1 October 2000 to 30 September 2019. We randomly selected 200 notes and compared the output of the NLP algorithm against visual inspection by raters masked to the NLP output for five sleep parameters: total sleep time (TST), sleep efficiency (SE), sleep onset latency (SOL), wake after sleep onset (WASO), and apnea-hypopnea index (AHI). Results: In the training phase, NLP performance was >0.90 for precision, recall, and F-1 score for TST, SOL, SE, WASO, and AHI. In the test phase, performance was likewise >0.90 for precision, recall, and F-1 score for all five parameters. Conclusions: This study showed that NLP is an accurate technique to extract sleep parameters from PSG reports in the EMR. Thus, NLP can serve as an effective tool in large health care systems to evaluate and improve patient care.
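The abstract does not describe how the algorithm works internally or how matches were scored, so the following is only a minimal Python sketch of the general idea, not the authors' method. The regular expressions in `PATTERNS`, the helper names `extract` and `score`, the sample note text, and the convention for counting true positives, false positives, and false negatives are all assumptions made for illustration.

```python
import re

# Hypothetical patterns: a keyword, up to 20 non-digit characters, then a number.
# Real PSG reports vary widely in wording, so these would need tuning.
NUM = r"\D{0,20}?(\d+(?:\.\d+)?)"
PATTERNS = {
    "TST": re.compile(r"(?:total sleep time|\bTST\b)" + NUM, re.I),
    "SE": re.compile(r"sleep efficiency" + NUM, re.I),
    "SOL": re.compile(r"(?:sleep onset latency|\bSOL\b)" + NUM, re.I),
    "WASO": re.compile(r"(?:wake after sleep onset|\bWASO\b)" + NUM, re.I),
    "AHI": re.compile(r"(?:apnea[- ]hypopnea index|\bAHI\b)" + NUM, re.I),
}


def extract(note):
    """Return the first numeric value found for each parameter, or None."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(note)
        result[name] = float(match.group(1)) if match else None
    return result


def score(predicted, reference, tol=1e-6):
    """Precision, recall, and F1 for one parameter across notes.

    Assumed convention (not taken from the paper):
      true positive  = value extracted and equal to the rater's value
      false positive = value extracted but different from the rater's value
      false negative = rater recorded a value but nothing was extracted
    """
    tp = fp = fn = 0
    for pred, ref in zip(predicted, reference):
        if pred is None:
            if ref is not None:
                fn += 1
        elif ref is not None and abs(pred - ref) <= tol:
            tp += 1
        else:
            fp += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up note text, for illustration only.
sample_note = (
    "Total sleep time (TST): 380.5 min. Sleep efficiency was 85%. "
    "Sleep onset latency: 12 min. WASO 45 min. "
    "The apnea-hypopnea index (AHI) was 22.3 events/hour."
)
extracted = extract(sample_note)
```

In a validation like the one described, `score` would be called once per parameter, with `predicted` holding the extracted values for the sampled notes and `reference` holding the values recorded by the masked raters.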

Published in Healthcare

ISSN: 2227-9032 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Medicine
Website: http://www.mdpi.com/journal/healthcare
