Applying Deep Learning Model to Predict Diagnosis Code of Medical Records

Jakir Hossain Bhuiyan Masud; Chen-Cheng Kuo; Chih-Yang Yeh; Hsuan-Chia Yang; Ming-Chin Lin

doi:10.3390/diagnostics13132297

Diagnostics (Jul 2023)

Affiliations

Jakir Hossain Bhuiyan Masud: Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
Chen-Cheng Kuo: Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
Chih-Yang Yeh: Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
Hsuan-Chia Yang: Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
Ming-Chin Lin: Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan

DOI: https://doi.org/10.3390/diagnostics13132297
Journal volume & issue: Vol. 13, no. 13, p. 2297

Abstract

The International Classification of Diseases (ICD) is a diagnostic classification standard that is widely used as a reference system in healthcare and insurance. However, finding and assigning the correct diagnosis code from a patient’s medical records takes time and effort. In response, deep learning (DL) methods have been developed to assist physicians in the ICD coding process. We propose a deep learning model that uses clinical notes from medical records to predict ICD-10 codes. Our research used text-based medical data from the outpatient department (OPD) of a university hospital, collected from January to December 2016. The dataset comprised clinical notes from five departments, for a total of 21,953 medical records. The clinical notes comprised subjective, objective, assessment, and plan (SOAP) notes, a diagnosis code, and a drug list. The dataset was divided into two groups: 90% for training and 10% for testing. We applied a natural language processing (NLP) technique, word embedding with Word2Vec, to process the text. A convolutional neural network (CNN) model was then built on the processed clinical notes. Three metrics (precision, recall, and F-score) were used to evaluate the CNN model. The model achieved clinically acceptable results across the five departments (precision: 0.53–0.96; recall: 0.85–0.99; and F-score: 0.65–0.98). It performed best in the department of cardiology, with a precision of 0.95, a recall of 0.99, and an F-score of 0.98. Our proposed CNN model significantly improved prediction performance for an automated ICD-10 code prediction system based on prior clinical information. This CNN model could reduce the laborious task of manual coding and could assist physicians in making a better diagnosis.
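
The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a one-vs-rest count for a single target code and takes the F-score to be the balanced F1 (harmonic mean of precision and recall); the abstract does not state the averaging scheme or the F-score variant, so both are assumptions. The function name and the toy ICD-10 labels are hypothetical and are not drawn from the study's data.

```python
def precision_recall_f1(true_codes, predicted_codes, target):
    """One-vs-rest precision, recall, and F1 for a single diagnosis code.

    Assumption: F-score means F1; the averaging scheme used in the
    paper is not given in the abstract.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_codes, predicted_codes):
        if pred == target and truth == target:
            tp += 1  # code predicted and correct
        elif pred == target:
            fp += 1  # code predicted but wrong
        elif truth == target:
            fn += 1  # code missed by the model
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy labels, for illustration only.
true_codes = ["I10", "I10", "I25.1", "I10", "E11.9"]
predicted_codes = ["I10", "I10", "I10", "I10", "E11.9"]

precision, recall, f1 = precision_recall_f1(true_codes, predicted_codes, "I10")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case the model never misses the target code but over-predicts it once, so recall exceeds precision, the same pattern as the departmental ranges reported in the abstract.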

Published in Diagnostics

ISSN: 2075-4418 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Medicine: Medicine (General)
Website: http://www.mdpi.com/journal/diagnostics
