Named entity recognition of Chinese electronic medical records based on multifeature embedding and attention mechanism

Dun-wei GONG; Yong-kai ZHANG; Yi-nan GUO; Bin WANG; Kuan-lu FAN; Yan HUO

doi:10.13374/j.issn2095-9389.2021.01.12.006

工程科学学报 (Chinese Journal of Engineering) (Sep 2021)

Affiliations

Dun-wei GONG: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
Yong-kai ZHANG: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
Yi-nan GUO: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
Bin WANG: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
Kuan-lu FAN: Department of Endocrinology, the Second Affiliated Hospital of Xuzhou Medical University, Xuzhou 221000, China
Yan HUO: Department of Endocrinology, Affiliated Hospital of China University of Mining and Technology, Xuzhou 221116, China

Journal volume & issue: Vol. 43, no. 9
pp. 1190 – 1196

Abstract

Medical records, an essential part of residents' health care records, hold all the information about patients' clinical treatment and were traditionally written by doctors on paper. With the development of information technology, electronic medical records, which are easier to store and manage, are gradually replacing paper ones. Intelligent auxiliary diagnosis, patient portrait construction, and disease prediction based on medical reports have become research hotspots in intelligent medical care. To fully discover the hidden relationships between symptoms and diseases in the documents stored in electronic medical records, an efficient named entity recognition algorithm is key. Although several studies have addressed named entity recognition, relatively little research covers information extraction from Chinese electronic medical records. To the best of our knowledge, documents in Chinese electronic medical records contain a large number of nested named entities and short sentences. Moreover, the logic linking the sentences is weak, which makes the syntactic structure complex. To recognize medical entities effectively, a novel named entity recognition method based on multifeature embedding and an attention mechanism is proposed. After three types of features derived from characters, words, and glyphs are embedded in the input representation layer, an attention mechanism is introduced into the hidden layer of the bidirectional long short-term memory network so that the model focuses on the characters related to medical entities. Finally, the optimal labels for the five types of entities in Chinese electronic medical records (diseases, body parts, symptoms, drugs, and operations) are obtained. In experiments on both open and self-built Chinese electronic medical records, the recognition accuracy, recall rate, and F1 value of the proposed algorithm all exceed 97%, which shows that the proposed algorithm can effectively identify the various entities in Chinese electronic medical records.
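
The two ideas named in the abstract, multifeature embedding and attention over hidden states, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the toy vectors, and the choice of scaled dot-product self-attention are all assumptions made for illustration, since the abstract does not say which attention variant is used, and the hidden states here are plain lists standing in for the outputs of a bidirectional long short-term memory network.

```python
import math


def concat_features(char_vec, word_vec, glyph_vec):
    """Join the character, word, and glyph vectors of one character
    into a single input vector (hypothetical helper)."""
    return list(char_vec) + list(word_vec) + list(glyph_vec)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(hidden_states):
    """Scaled dot-product self-attention (an assumed variant).

    hidden_states: one vector per character, standing in for BiLSTM outputs.
    Returns (contexts, weights): for each position, a weighted sum of all
    hidden states and the attention weights that produced it.
    """
    dim = len(hidden_states[0])
    scale = math.sqrt(dim)
    contexts, all_weights = [], []
    for query in hidden_states:
        weights = softmax([dot(query, key) / scale for key in hidden_states])
        context = [
            sum(w * h[i] for w, h in zip(weights, hidden_states))
            for i in range(dim)
        ]
        contexts.append(context)
        all_weights.append(weights)
    return contexts, all_weights


# Toy usage: three characters, each with made-up 2-d char, 1-d word, 1-d glyph features.
inputs = [
    concat_features([0.1, 0.2], [0.5], [0.3]),
    concat_features([0.4, 0.1], [0.5], [0.7]),
    concat_features([0.9, 0.8], [0.2], [0.1]),
]
contexts, weights = self_attention(inputs)
```

In a full model the concatenated vectors would first pass through the recurrent network, and the attended outputs would feed a labeling layer. Both steps are omitted here to keep the sketch small.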

Published in 工程科学学报 (Chinese Journal of Engineering)

ISSN: 2095-9389 (Print)
Publisher: Science Press
Country of publisher: China
LCC subjects: Technology: Mining engineering. Metallurgy; Technology: Engineering (General). Civil engineering (General): Environmental engineering
Website: https://cje.ustb.edu.cn/indexen.htm
