Deep-ADCA: Development and Validation of Deep Learning Model for Automated Diagnosis Code Assignment Using Clinical Notes in Electronic Medical Records

Jakir Hossain Bhuiyan Masud; Chiang Shun; Chen-Cheng Kuo; Md. Mohaimenul Islam; Chih-Yang Yeh; Hsuan-Chia Yang; Ming-Chin Lin

doi:10.3390/jpm12050707

Journal of Personalized Medicine (Apr 2022)

Affiliations

Jakir Hossain Bhuiyan Masud: Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
Chiang Shun: Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
Chen-Cheng Kuo: Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
Md. Mohaimenul Islam: International Center for Health Information Technology (ICHIT), College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
Chih-Yang Yeh: Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
Hsuan-Chia Yang: Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
Ming-Chin Lin: Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan

DOI: https://doi.org/10.3390/jpm12050707
Journal volume & issue: Vol. 12, no. 5, p. 707

Abstract

International Classification of Diseases (ICD) codes are widely used to improve clinical, financial, and administrative performance. Inaccurate ICD coding can lower the quality of care and delay or prevent reimbursement. However, selecting the appropriate ICD code from a patient’s clinical history is time-consuming and requires expert knowledge. The rapid spread of electronic medical records (EMRs) has generated a large amount of clinical data and provides an opportunity to predict ICD codes using deep learning models. The main objective of this study was to use a deep learning-based natural language processing (NLP) model to accurately predict ICD-10 codes, which could help providers make better clinical decisions and improve their level of service. We retrospectively collected clinical notes from five outpatient departments (OPDs) at one university teaching hospital between January 2016 and December 2016. We applied NLP techniques, including Global Vectors (GloVe), word2vec, and embedding techniques, to process the data. The dataset was split into independent training and testing sets comprising 90% and 10% of the entire dataset, respectively. A convolutional neural network (CNN) model was developed, and its performance was measured using precision, recall, and F-score. A total of 21,953 medical records were collected from 5016 patients. The performance of the CNN model across the five departments was clinically satisfactory (precision: 0.50 to 0.69; recall: 0.78 to 0.91). The model performed best for the cardiology department, with a precision of 69%, a recall of 89%, and an F-score of 78%. The CNN model for predicting ICD-10 codes provides an opportunity to improve the quality of care. Implementing this model in real-world clinical settings could reduce the manual coding workload, enhance the efficiency of clinical coding, and support physicians in making better clinical decisions.
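
The abstract reports precision, recall, and F-score but does not say which F-score variant was used. As a minimal sketch, assuming the balanced F1 score (the harmonic mean of precision and recall), the reported cardiology figures can be checked for internal consistency. The function name `f_score` and the `beta` parameter are illustrative and are not taken from the paper.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1.0 gives the balanced F1 score.

    Assumption: the abstract's "F-score" is F1. The paper's abstract
    does not state the variant, so beta is left configurable.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both inputs are zero; the score is conventionally defined as 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Cardiology department figures as reported in the abstract.
cardiology_f1 = f_score(precision=0.69, recall=0.89)
print(round(cardiology_f1, 2))  # 0.78
```

Under this assumption, a precision of 0.69 and a recall of 0.89 give an F1 of about 0.78, which matches the reported 78%.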

Published in Journal of Personalized Medicine

ISSN: 2075-4426 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Medicine
Website: http://www.mdpi.com/journal/jpm
