Span-based model for overlapping entity recognition and multi-relations classification in the food domain

Mengqi Zhang; Lei Ma; Yanzhao Ren; Ganggang Zhang; Xinliang Liu

doi:10.3934/mbe.2022240

Mathematical Biosciences and Engineering (Mar 2022)

Affiliations

Mengqi Zhang: 1. School of E-business and Logistics, Beijing Technology and Business University, Beijing 100048, China 2. National Engineering Laboratory for Agri-product Quality Traceability, Beijing Technology and Business University, Beijing 100048, China
Lei Ma: 1. School of E-business and Logistics, Beijing Technology and Business University, Beijing 100048, China 2. National Engineering Laboratory for Agri-product Quality Traceability, Beijing Technology and Business University, Beijing 100048, China
Yanzhao Ren: 3. School of Computer Science and Engineering, Beijing Technology and Business University, Beijing 100048, China
Ganggang Zhang: 4. Digital Campus Construction Center, Capital Normal University, Beijing 100048, China
Xinliang Liu: 1. School of E-business and Logistics, Beijing Technology and Business University, Beijing 100048, China 2. National Engineering Laboratory for Agri-product Quality Traceability, Beijing Technology and Business University, Beijing 100048, China

DOI: https://doi.org/10.3934/mbe.2022240
Journal volume & issue: Vol. 19, no. 5, pp. 5134–5152

Abstract

Information extraction (IE) is an important part of the entire knowledge graph lifecycle. In the food domain, extracting information such as ingredients and cooking methods from Chinese recipes is crucial to safety risk analysis and ingredient identification. Compared with English IE, Chinese IE is much more challenging because of the language's complex structure, the rich information carried by word combinations, and the lack of tense. This difficulty is particularly prominent in the food domain, where knowledge is dense and syntactic structure is imprecise. However, existing IE methods focus only on the features of entities within a sentence, such as context and position, and ignore both the features of the entity itself and the influence of its attributes on the prediction of relationships between entities. To solve the problems of overlapping entity recognition and multi-relation classification in the food domain, we propose a span-based IE model called SpIE. SpIE builds a span representation for each possible candidate entity to capture span-level features, which transforms named entity recognition (NER) into a classification task. In addition, SpIE feeds extra information about each entity into the relation classification (RC) model by considering the effect of the entity's attributes (both the entity mention and the entity type) on the relationship between entity pairs. We apply SpIE to two datasets and observe that it significantly outperforms previous neural approaches because it captures the features of overlapping entities and entity attributes, and that it remains very competitive in general IE.
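The span-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `enumerate_spans` and `pair_features`, the `max_len` cutoff, the English example tokens, and the labels `DISH` and `INGREDIENT` are all assumptions made for illustration. The sketch only shows why enumerating every candidate span lets overlapping entities coexist as separate classification targets, and how an entity's mention and type could be passed along as extra input to a relation classifier.

```python
from typing import Dict, List, Tuple


def enumerate_spans(tokens: List[str], max_len: int) -> List[Tuple[int, int]]:
    """Return every (start, end) span, end exclusive, of at most max_len tokens."""
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


def pair_features(
    tokens: List[str],
    head: Tuple[int, int, str],
    tail: Tuple[int, int, str],
) -> Dict[str, str]:
    """Collect mention text and entity type for a (head, tail) entity pair.

    Each entity is (start, end, type). In a real model these attributes would
    be embedded and combined with span representations; here they stay strings.
    """
    return {
        "head_mention": " ".join(tokens[head[0]:head[1]]),
        "head_type": head[2],
        "tail_mention": " ".join(tokens[tail[0]:tail[1]]),
        "tail_type": tail[2],
    }


# Hypothetical example sentence and labels (not taken from the paper's data).
tokens = ["stir", "fried", "beef", "noodles"]
spans = enumerate_spans(tokens, max_len=3)

# "beef" (2, 3) and "beef noodles" (2, 4) overlap, yet both are candidates,
# so a span classifier can assign each its own entity type.
features = pair_features(tokens, (2, 4, "DISH"), (2, 3, "INGREDIENT"))
print(len(spans), features)
```

Because each span is classified independently, a nested mention such as an ingredient inside a dish name does not compete with the longer mention for the same token labels, which is the limitation of token-level sequence tagging that span-based approaches avoid.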

Published in Mathematical Biosciences and Engineering

ISSN: 1551-0018 (Online)
Publisher: AIMS Press
Country of publisher: United States
LCC subjects: Technology: Chemical technology: Biotechnology; Science: Mathematics
Website: https://www.aimspress.com/journal/MBE
