Heliyon (Sep 2024)

Ensemble neural models for ICD code prediction using unstructured and structured healthcare data

  • Alimurtaza Mustafa Merchant,
  • Naveen Shenoy,
  • Sidharth Lanka,
  • Sowmya Kamath

Journal volume & issue
Vol. 10, no. 17
p. e36569

Abstract

Read online

Disease coding is the process of assigning one or more standardized diagnostic codes to clinical notes that are maintained by health practitioners (e.g. clinicians) to track patient condition. Such a coding process is often expensive and error-prone, as human medical coders primarily perform it. Automating diagnostic coding using Artificial Intelligence is seen as an essential solution in Hospital Information Management Systems and approaches built on Convolutional Neural Networks currently perform best. In this work, a neural model built on unstructured clinical text for enabling automatic diagnostic coding for given patient discharge summaries is proposed. We incorporate a structured self-attention mechanism designed to boost learning of label-specific vectors and the significant clinical text snippets associated with a certain label for this purpose. These vectors are then combined with a novel code description pipeline leveraging the descriptions provided for each standardized diagnostic code. The proposed model achieved best performance in terms of standard metrics over the MIMIC-III dataset, outperforming models based on Longformers and Knowledge graphs. Furthermore, to leverage structured clinical data to enhance the proposed model, and to enable improved diagnostic code prediction, model ensembling is considered. A neural model built on structured data by leveraging supervised machine learning algorithms such as random forest and boosting, is designed for multi-class code classification. Experimental results revealed that the proposed ensemble models show promising performance compared to traditional models that rely solely on unstructured or structured clinical data, emphasizing their suitability for real-world deployment.

Keywords