IEEE Access (Jan 2024)

NLP-Powered Healthcare Insights: A Comparative Analysis for Multi-Labeling Classification With MIMIC-CXR Dataset

  • Ege Erberk Uslu,
  • Emine Sezer,
  • Zekeriya Anil Guven

DOI
https://doi.org/10.1109/ACCESS.2024.3400007
Journal volume & issue
Vol. 12
pp. 67314 – 67324

Abstract

The digitization of the healthcare industry has led to a growing number of applications that use machine learning and image processing techniques to improve the diagnostic process. These applications draw on a variety of medical data, including laboratory results, clinical findings, MRI scans, tomographic images, and radiological images. Free-text healthcare documentation, such as well-structured discharge summaries, also contains valuable information. Natural language processing (NLP) supports the development of automated systems that generate health reports, using domain-specific and prior knowledge to extract relevant information from medical records. This article investigates the use of NLP techniques for chest X-ray classification. A total of 14 distinct impressions derived from chest radiography findings in the MIMIC-CXR dataset were used in a multi-label classification procedure. Six language models derived from BERT, along with three classification algorithms, were employed to evaluate the effectiveness of the models and the dataset for multi-label classification. In the experiments, the best result was a prediction success rate of 80.47% across the 14 impressions in the dataset.
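To make the multi-label setting concrete, the following is a minimal sketch of how report-level findings can be encoded as 14-dimensional multi-hot vectors and scored. It is not the authors' pipeline. The label names are an assumption (the 14 CheXpert-style categories commonly distributed with MIMIC-CXR; the abstract does not list them), the helper names are hypothetical, and the abstract does not state which metric the 80.47% figure refers to, so three common multi-label scores are shown side by side on made-up toy data.

```python
# Assumption: the 14 CheXpert-style categories commonly used with MIMIC-CXR.
# The abstract itself does not enumerate the 14 impressions.
LABELS = [
    "Atelectasis", "Cardiomegaly", "Consolidation", "Edema",
    "Enlarged Cardiomediastinum", "Fracture", "Lung Lesion", "Lung Opacity",
    "No Finding", "Pleural Effusion", "Pleural Other", "Pneumonia",
    "Pneumothorax", "Support Devices",
]


def to_multi_hot(findings, labels=LABELS):
    """Encode a list of finding names as a 0/1 vector, one slot per label."""
    present = set(findings)
    unknown = present - set(labels)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in present else 0 for name in labels]


def exact_match_accuracy(y_true, y_pred):
    """Fraction of reports whose full label vector is predicted exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def hamming_accuracy(y_true, y_pred):
    """Fraction of individual (report, label) cells predicted correctly."""
    total = correct = 0
    for t, p in zip(y_true, y_pred):
        for a, b in zip(t, p):
            total += 1
            correct += int(a == b)
    return correct / total


def micro_f1(y_true, y_pred):
    """F1 computed from true/false positives pooled over all labels."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        for a, b in zip(t, p):
            tp += int(a == 1 and b == 1)
            fp += int(a == 0 and b == 1)
            fn += int(a == 1 and b == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (invented for illustration): three reports, one missed finding.
y_true = [
    to_multi_hot(["Cardiomegaly", "Edema"]),
    to_multi_hot(["No Finding"]),
    to_multi_hot(["Pneumonia", "Pleural Effusion"]),
]
y_pred = [
    to_multi_hot(["Cardiomegaly", "Edema"]),
    to_multi_hot(["No Finding"]),
    to_multi_hot(["Pneumonia"]),
]

print(exact_match_accuracy(y_true, y_pred))
print(hamming_accuracy(y_true, y_pred))
print(micro_f1(y_true, y_pred))
```

The three scores can differ substantially on the same predictions, which is why a single reported percentage for a multi-label task is best read together with the metric definition given in the full paper.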

Keywords