BioMedical Engineering OnLine (Nov 2018)

Comparison of named entity recognition methodologies in biomedical documents

  • Hye-Jeong Song,
  • Byeong-Cheol Jo,
  • Chan-Young Park,
  • Jong-Dae Kim,
  • Yu-Seop Kim

DOI
https://doi.org/10.1186/s12938-018-0573-6
Journal volume & issue
Vol. 17, no. S2
pp. 1 – 14

Abstract

Background: Biomedical named entity recognition (Bio-NER) identifies terms in biomedical text such as RNA, protein, cell type, cell line, and DNA, and is one of the most fundamental tasks in biomedical knowledge discovery from text. The system described here was developed using the BioNLP/NLPBA 2004 shared task data, and experiments were conducted on the training and evaluation sets provided by the task organizers.

Results: Against a baseline F1 score of 70.09%, the Jordan-type and Elman-type RNN algorithms achieved F1 scores of approximately 60.53% and 58.80%, respectively. With CRF as the machine learning algorithm, the CCA, GloVe, and Word2Vec word embeddings yielded F1 scores of 72.73%, 72.74%, and 72.82%, respectively.

Conclusions: Using word embeddings constructed through unsupervised learning can save the time and cost required to construct the training data.
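The F1 scores reported above combine precision and recall over recognized entities. As a rough illustration only, the sketch below computes an entity-level F1 from BIO-tagged sequences. It assumes exact-match scoring (an entity counts as correct only when both its boundaries and its type match), which the abstract does not state; the function names `bio_to_spans` and `f1_score` and the example tags are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

# A span is (start index, end index exclusive, entity type).
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (stray I- tags are ignored)."""
    spans: Set[Span] = set()
    start, etype = None, None
    # The trailing "O" is a sentinel that closes a span ending at the last token.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def f1_score(gold: List[str], pred: List[str]) -> float:
    """Exact-match entity-level F1 (assumed scoring scheme, see lead-in)."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical five-token sentence: the protein span is predicted correctly,
# while the DNA mention is mislabeled as RNA.
gold = ["B-protein", "I-protein", "O", "B-DNA", "O"]
pred = ["B-protein", "I-protein", "O", "B-RNA", "O"]
score = f1_score(gold, pred)
```

In this toy case one of two predicted entities is correct and one of two gold entities is found, so precision and recall are both 0.5.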

Keywords