IEEE Access (Jan 2022)
Named Entity Recognition for Chinese Electronic Medical Records Based on Multitask and Transfer Learning
Abstract
Current work on named entity recognition for Chinese electronic medical records trains a separate model for each type of record, and the performance of each model depends on the amount of training data available for its dataset. However, different types of electronic medical records share similar semantic information, and current models do not take full advantage of this common knowledge. To address this problem, we propose a multi-task learning framework that transfers knowledge across multiple types of electronic medical records through a shared encoder. Experiments demonstrate that our model outperforms the single-task model based on BERT: F1 scores improve by more than 1% on average across the four datasets, and precision improves by more than 3.5% on individual datasets. Further analysis shows that our model also achieves better F1 scores on long-tailed datasets and small datasets.
Keywords