BMC Medical Informatics and Decision Making (Apr 2019)

Natural language processing of radiology reports for identification of skeletal site-specific fractures

  • Yanshan Wang,
  • Saeed Mehrabi,
  • Sunghwan Sohn,
  • Elizabeth J. Atkinson,
  • Shreyasee Amin,
  • Hongfang Liu

DOI
https://doi.org/10.1186/s12911-019-0780-5
Journal volume & issue
Vol. 19, no. S3
pp. 23 – 29

Abstract

Read online

Abstract Background Osteoporosis has become an important public health issue. Most of the population, particularly elderly people, are at some degree of risk of osteoporosis-related fractures. Accurate identification and surveillance of patient populations with fractures has a significant impact on reduction of cost of care by preventing future fractures and its corresponding complications. Methods In this study, we developed a rule-based natural language processing (NLP) algorithm for identification of twenty skeletal site-specific fractures from radiology reports. The rule-based NLP algorithm was based on regular expressions developed using MedTagger, an NLP tool of the Apache Unstructured Information Management Architecture (UIMA) pipeline to facilitate information extraction from clinical narratives. Radiology notes were retrieved from the Mayo Clinic electronic health records data warehouse. We developed rules for identifying each fracture type according to physicians’ knowledge and experience, and refined these rules via verification with physicians. This study was approved by the institutional review board (IRB) for human subject research. Results We validated the NLP algorithm using the radiology reports of a community-based cohort at Mayo Clinic with the gold standard constructed by medical experts. The micro-averaged results of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1-score of the proposed NLP algorithm are 0.930, 1.0, 1.0, 0.941, 0.961, respectively. The F1-score is 1.0 for 8 fractures, and above 0.9 for a total of 17 out of 20 fractures (85%). Conclusions The results verified the effectiveness of the proposed rule-based NLP algorithm in automatic identification of osteoporosis-related skeletal site-specific fractures from radiology reports. The NLP algorithm could be utilized to accurately identify the patients with fractures and those who are also at high risk of future fractures due to osteoporosis. Appropriate care interventions to those patients, not only the most at-risk patients but also those with emerging risk, would significantly reduce future fractures.

Keywords