IEEE Access (Jan 2020)

A Boundary Assembling Method for Nested Biomedical Named Entity Recognition

  • Yanping Chen,
  • Ying Hu,
  • Yijing Li,
  • Ruizhang Huang,
  • Yongbin Qin,
  • Yuefei Wu,
  • Qinghua Zheng,
  • Ping Chen

DOI
https://doi.org/10.1109/ACCESS.2020.3040182
Journal volume & issue
Vol. 8
pp. 214141 – 214152

Abstract

Biomedical named entity recognition (BNER) is an important task in biomedical natural language processing, a domain in which neologisms (new terms and words) are coined constantly. Most existing work identifies only biomedical named entities with flat structures and ignores nested and discontinuous biomedical named entities. Because biomedical texts often use nested structures to represent the semantic information of named entities, existing methods fail to exploit much of the information available in them. This paper focuses on identifying nested biomedical named entities using a boundary assembling (BA) model, a cascading framework consisting of three steps. First, the start and end boundaries of named entities are identified. Second, the boundaries are assembled into named entity candidates. Finally, a classifier filters out false named entities. Our approach is effective in handling the nesting and discontinuity problems in biomedical named entity recognition tasks. It improves performance considerably, achieving an F1-score of 81.34% on the GENIA dataset.
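The three-step cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the boundary detectors or the classifier, so the boundary indices are supplied by hand and the classifier is a toy lexicon lookup. The names `assemble_candidates`, `filter_candidates`, `toy_classify`, and the `max_len` cutoff are all hypothetical and introduced only for illustration.

```python
from typing import Callable, List, Optional, Tuple

Span = Tuple[int, int]  # inclusive token indices (start, end)


def assemble_candidates(starts: List[int], ends: List[int],
                        max_len: int = 8) -> List[Span]:
    """Step 2: pair each start boundary with every end boundary at or after it.

    max_len is an assumed cutoff on candidate length, not taken from the paper.
    """
    return [(s, e)
            for s in sorted(starts)
            for e in sorted(ends)
            if s <= e < s + max_len]


def filter_candidates(tokens: List[str], candidates: List[Span],
                      classify: Callable[[List[str]], Optional[str]]
                      ) -> List[Tuple[Span, str]]:
    """Step 3: keep only candidates that the classifier accepts as entities."""
    kept = []
    for s, e in candidates:
        label = classify(tokens[s:e + 1])
        if label is not None:  # None marks a false candidate
            kept.append(((s, e), label))
    return kept


# Toy stand-in for a trained classifier: a lookup in a tiny lexicon.
LEXICON = {"IL-2": "protein", "human IL-2 gene": "DNA"}


def toy_classify(span_tokens: List[str]) -> Optional[str]:
    return LEXICON.get(" ".join(span_tokens))


tokens = ["human", "IL-2", "gene", "expression"]
# Step 1 (assumed output of a boundary detector): token indices that
# begin or end some entity.
starts, ends = [0, 1], [1, 2]

candidates = assemble_candidates(starts, ends)
entities = filter_candidates(tokens, candidates, toy_classify)
print(entities)
```

Because every start boundary is paired with every compatible end boundary, the nested span "IL-2" and the enclosing span "human IL-2 gene" both survive as candidates, which a flat sequence-labelling scheme with one tag per token cannot represent.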

Keywords