JMIR Medical Informatics (Oct 2024)

Multifaceted Natural Language Processing Task–Based Evaluation of Bidirectional Encoder Representations From Transformers Models for Bilingual (Korean and English) Clinical Notes: Algorithm Development and Validation

  • Kyungmo Kim,
  • Seongkeun Park,
  • Jeongwon Min,
  • Sumin Park,
  • Ju Yeon Kim,
  • Jinsu Eun,
  • Kyuha Jung,
  • Yoobin Elyson Park,
  • Esther Kim,
  • Eun Young Lee,
  • Joonhwan Lee,
  • Jinwook Choi

DOI: https://doi.org/10.2196/52897
Journal volume & issue: Vol. 12, e52897

Abstract

Background: The bidirectional encoder representations from transformers (BERT) model has attracted considerable attention in clinical applications, such as patient classification and disease prediction. However, current studies have typically progressed to application development without thoroughly assessing the model's comprehension of clinical context. Furthermore, few comparative studies have examined BERT models on medical documents from non-English-speaking countries, so the applicability of BERT models trained on English clinical notes to non-English contexts has yet to be confirmed. To address these gaps in the literature, this study focused on identifying the most effective BERT model for non-English clinical notes.

Objective: In this study, we evaluated the contextual understanding of various BERT models applied to mixed Korean and English clinical notes. The objective was to identify the BERT model that best understands the context of such documents.

Methods: Using data from 164,460 patients in a South Korean tertiary hospital, we pretrained BERT-base, BERT for Biomedical Text Mining (BioBERT), Korean BERT (KoBERT), and Multilingual BERT (M-BERT) to improve their contextual comprehension, and subsequently compared their performance on 7 fine-tuning tasks.

Results: Model performance varied by task and token usage. BERT-base and BioBERT excelled in tasks using classification ([CLS]) token embeddings, such as document classification, with BioBERT achieving the highest F1-score.

Conclusions: This study highlighted the effectiveness of various BERT models in a multilingual clinical domain. The findings can serve as a reference for clinical and language-based applications.
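
To make the Methods concrete, the sketch below shows what continued (domain-adaptive) pretraining of a BERT checkpoint on clinical notes can look like using the Hugging Face transformers and datasets libraries. It is a minimal illustration under stated assumptions, not the authors' pipeline: the checkpoint name, the notes.txt file (one note per line), and all hyperparameters are placeholders.

```python
# Minimal sketch of continued (domain-adaptive) masked language model
# pretraining on clinical notes. Assumes Hugging Face transformers/datasets;
# "notes.txt" is a placeholder file, not the study's data.
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

MODEL_NAME = "bert-base-multilingual-cased"  # M-BERT; BioBERT/KoBERT are alternatives
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)

# Load raw notes, one per line, and tokenize them
dataset = load_dataset("text", data_files={"train": "notes.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Standard BERT-style masking: 15% of tokens are masked for prediction
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True,
                                           mlm_probability=0.15)
args = TrainingArguments(output_dir="mbert-clinical",
                         num_train_epochs=1,
                         per_device_train_batch_size=16)

Trainer(model=model, args=args, train_dataset=dataset,
        data_collator=collator).train()
```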
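Likewise, the Results' use of [CLS] token embeddings for document classification can be sketched as a linear head over the final-layer [CLS] vector. The model name, label count, and example note below are illustrative assumptions, not details taken from the study.

```python
# Minimal sketch: document classification from the [CLS] token embedding.
# Assumes Hugging Face transformers; model name and labels are illustrative.
import torch
from torch import nn
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

class ClsClassifier(nn.Module):
    """Linear classification head over the final-layer [CLS] embedding."""
    def __init__(self, encoder, num_labels: int):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(encoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls_emb = out.last_hidden_state[:, 0, :]  # position 0 holds [CLS]
        return self.head(cls_emb)

model = ClsClassifier(encoder, num_labels=2)

# Mixed Korean-English note (illustrative, not real patient data):
# "Patient presented complaining of chest pain. No abnormal findings on EKG."
note = "환자는 chest pain 호소하며 내원함. EKG 상 특이 소견 없음."
batch = tokenizer(note, return_tensors="pt", truncation=True, max_length=512)
logits = model(batch["input_ids"], batch["attention_mask"])
print(logits.softmax(dim=-1))
```

In practice the head and encoder would be fine-tuned jointly on labeled notes; the forward pass above only shows where the [CLS] embedding enters the classifier.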