BMC Medical Informatics and Decision Making (Dec 2019)

Incorporating medical code descriptions for diagnosis prediction in healthcare

  • Fenglong Ma,
  • Yaqing Wang,
  • Houping Xiao,
  • Ye Yuan,
  • Radha Chitta,
  • Jing Zhou,
  • Jing Gao

DOI
https://doi.org/10.1186/s12911-019-0961-2
Journal volume & issue
Vol. 19, no. S6
pp. 1 – 13

Abstract

Background: Diagnosis prediction aims to forecast the future health status of patients from their historical electronic health records (EHR), an important yet challenging task in healthcare informatics. Existing diagnosis prediction approaches mainly employ recurrent neural networks (RNN) with attention mechanisms. However, these approaches ignore the importance of code descriptions, i.e., the medical definitions of diagnosis codes. We believe that taking diagnosis code descriptions into account can help state-of-the-art models not only learn meaningful code representations but also improve predictive performance, especially when the EHR data are insufficient.

Methods: We propose a simple but general diagnosis prediction framework with two basic components: diagnosis code embedding and a predictive model. To learn interpretable code embeddings, we apply convolutional neural networks (CNN) to model the medical descriptions of diagnosis codes, which are extracted from online medical websites. The learned medical embedding matrix is used to embed the input visits into vector representations, which are then fed into the predictive model. Any existing diagnosis prediction approach (referred to as the base model) can be cast into the proposed framework as the predictive model (the result is called the enhanced model).

Results: We conduct experiments on two real medical datasets: the MIMIC-III dataset and the Heart Failure claim dataset. Experimental results show that the enhanced diagnosis prediction approaches significantly improve prediction performance. Moreover, we validate the effectiveness of the proposed framework when EHR data are insufficient. Finally, we visualize the learned medical code embeddings to show the interpretability of the proposed framework.

Conclusions: Given the historical visit records of a patient, the proposed framework is able to predict the information of the next visit by incorporating medical code descriptions.
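The two-stage pipeline summarized in the abstract (a CNN over each code's text description, then embedding a visit with the resulting matrix) can be illustrated with a minimal sketch. This is not the authors' implementation: the code identifiers `C1` to `C3`, the toy descriptions, the dimensions, the random (untrained) word vectors and filters, and the choice to embed a visit as the sum of its code embeddings are all assumptions made for illustration. In the paper's setting the parameters would be learned jointly with the predictive model.

```python
import random

WORD_DIM = 4      # size of each word vector (assumed)
NUM_FILTERS = 3   # size of each code embedding (assumed)
WIDTH = 2         # number of words per convolution window (assumed)

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def rand_vec(n):
    """Return n random weights; stands in for trained parameters."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def text_cnn_embed(word_vecs, filters, width):
    """Slide each filter over word windows, apply ReLU, max-pool over time."""
    dim = len(word_vecs[0])
    # Pad short descriptions with zero vectors so at least one window exists.
    padded = word_vecs + [[0.0] * dim] * max(0, width - len(word_vecs))
    embedding = []
    for f in filters:  # each filter is a flat list of width * dim weights
        best = 0.0  # max over ReLU outputs is never below zero
        for start in range(len(padded) - width + 1):
            window = [x for vec in padded[start:start + width] for x in vec]
            best = max(best, sum(w * x for w, x in zip(f, window)))
        embedding.append(best)
    return embedding


def embed_visit(visit_codes, code_embeddings):
    """Sum the embeddings of all codes in one visit (multi-hot vector times E)."""
    size = len(next(iter(code_embeddings.values())))
    total = [0.0] * size
    for code in visit_codes:
        for i, value in enumerate(code_embeddings[code]):
            total[i] += value
    return total


# Hypothetical codes and toy descriptions, not taken from the paper.
descriptions = {
    "C1": "congestive heart failure",
    "C2": "essential hypertension",
    "C3": "heart disease",
}

vocab = sorted({w for text in descriptions.values() for w in text.split()})
word_vectors = {w: rand_vec(WORD_DIM) for w in vocab}
filters = [rand_vec(WIDTH * WORD_DIM) for _ in range(NUM_FILTERS)]

# Embedding matrix: one row per diagnosis code, built from its description.
code_embeddings = {
    code: text_cnn_embed([word_vectors[w] for w in text.split()], filters, WIDTH)
    for code, text in descriptions.items()
}

# A visit containing two codes becomes a single vector for the base model.
visit_vec = embed_visit(["C1", "C2"], code_embeddings)
```

A sequence of such visit vectors, one per historical visit, is what a base model such as an RNN with attention would consume in place of embeddings learned from the codes alone.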

Keywords