Using Natural Language Processing to Identify Low Back Pain in Imaging Reports

Yeji Kim; Chanyoung Song; Gyuseon Song; Sol Bi Kim; Hyun-Wook Han; Inbo Han

doi:10.3390/app122412521

Applied Sciences (Dec 2022)

Affiliations

Yeji Kim: Research Competency Milestones Program of School of Medicine, CHA University School of Medicine, Bundang-gu, Seongnam-si 13488, Republic of Korea
Chanyoung Song: Department of Biomedical Informatics, CHA University School of Medicine, Bundang-gu, Seongnam-si 13488, Republic of Korea
Gyuseon Song: Department of Biomedical Informatics, CHA University School of Medicine, Bundang-gu, Seongnam-si 13488, Republic of Korea
Sol Bi Kim: Department of Neurosurgery, CHA University School of Medicine, CHA Bundang Medical Center, Seongnam-si 13497, Republic of Korea
Hyun-Wook Han: Department of Biomedical Informatics, CHA University School of Medicine, Bundang-gu, Seongnam-si 13488, Republic of Korea
Inbo Han: Department of Neurosurgery, CHA University School of Medicine, CHA Bundang Medical Center, Seongnam-si 13497, Republic of Korea

DOI: https://doi.org/10.3390/app122412521
Journal volume & issue: Vol. 12, no. 24
p. 12521

Abstract

A natural language processing (NLP) pipeline was developed to identify lumbar spine imaging findings associated with low back pain (LBP) in X-radiation (X-ray), computed tomography (CT), and magnetic resonance imaging (MRI) reports. From a total of 18,640 reports, a balanced sample of 300 X-ray, 300 CT, and 300 MRI reports was obtained by random sampling stratified by imaging modality. A total of 23 radiologic findings potentially related to LBP were defined, and their presence was extracted from the radiologic reports. In developing the NLP pipeline, sections and sentences of the radiology reports were segmented using a rule-based method, including regular expressions with negation detection. The dataset was randomly split into 80% for development and 20% for testing to evaluate the model's extraction performance. Performance was evaluated using recall, precision, accuracy, and the F1 score; all four metrics were greater than 0.9 for all 23 radiologic findings. The four scores were 1.0 for 10 radiologic findings (listhesis, annular fissure, disc bulge, disc extrusion, disc protrusion, endplate edema or Type 1 Modic change, lateral recess stenosis, Schmorl's node, osteophyte, and any stenosis). For the seven potentially clinically important radiologic findings, the F1 score ranged from 0.9882 to 1.0. In this study, a rule-based NLP system identifying 23 findings related to LBP from X-ray, CT, and MRI reports was developed, and it showed good performance with regard to all four scoring parameters.
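
The rule-based approach described in the abstract (sentence segmentation, regular-expression matching with negation detection, and scoring by recall, precision, accuracy, and F1) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the patterns, the negation cues, and the function names `split_sentences`, `extract_findings`, and `score` are all hypothetical and cover only two of the 23 findings.

```python
import re

# Hypothetical patterns for two findings; the study's actual rules are not reproduced here.
FINDING_PATTERNS = {
    "listhesis": re.compile(r"\b(?:spondylo|antero|retro)?listhesis\b", re.IGNORECASE),
    "disc_bulge": re.compile(
        r"\bdis[ck]s?\s+bulg(?:e|es|ing)\b|\bbulging\s+dis[ck]s?\b", re.IGNORECASE
    ),
}

# Simplified negation cues (an assumption); a cue counts only if it precedes the finding.
NEGATION = re.compile(r"\b(?:no|not|without|negative\s+for)\b", re.IGNORECASE)


def split_sentences(text):
    """Naive sentence segmentation on end punctuation or line breaks."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+|\n+", text) if s.strip()]


def extract_findings(report):
    """Return {finding: bool}; True if mentioned in any sentence without a preceding negation."""
    found = {name: False for name in FINDING_PATTERNS}
    for sentence in split_sentences(report):
        for name, pattern in FINDING_PATTERNS.items():
            match = pattern.search(sentence)
            if match and not NEGATION.search(sentence[: match.start()]):
                found[name] = True
    return found


def score(y_true, y_pred):
    """Compute recall, precision, accuracy, and F1 for one finding from boolean labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"recall": recall, "precision": precision, "accuracy": accuracy, "f1": f1}


if __name__ == "__main__":
    report = "Grade 1 anterolisthesis of L4 on L5. No disc bulge."
    print(extract_findings(report))
    print(score([True, True, False, False], [True, False, False, False]))
```

In this sketch, negation is resolved per sentence, so a negated finding in one sentence does not cancel a positive mention elsewhere in the same report. A production system would need a richer cue list and scope handling.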

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
