IEEE Access (Jan 2022)
Attention Retrieval Model for Entity Relation Extraction From Biological Literature
Abstract
Natural Language Processing (NLP) has contributed to extracting relationships among biological entities, such as genes, their mutations, proteins, diseases, processes, phenotypes, and drugs, for a comprehensive and concise understanding of information in the literature. Self-attention-based models for Relationship Extraction (RE) have played an increasingly important role in NLP. However, self-attention models for RE are typically framed as classifiers, which limits their practical usability in several ways. We present an alternative framework called the Attention Retrieval Model (ARM), which enhances the applicability of attention-based models for RE compared to the regular classification approach. Given a text sequence containing related entities/keywords, ARM learns the association between a chosen entity/keyword and the other entities present in the sequence, using an underlying self-attention mechanism. ARM provides a flexible framework that allows a modeller to customise their model, facilitates data integration, and incorporates expert knowledge, offering a more practical approach to RE. ARM can extract unseen relationships that are not annotated in the training data, analogous to zero-shot learning. To sum up, ARM provides an alternative self-attention-based deep learning framework for RE that can capture directed entity relationships.
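The core idea of retrieving associations from attention weights can be illustrated with a minimal sketch. This is not the authors' implementation: the toy embeddings, random projection matrices, and the `rank_associations` helper are illustrative assumptions, showing only how a self-attention matrix can be queried from a chosen entity position to rank the other entity positions in the sequence.

```python
import numpy as np

def self_attention_scores(X, Wq, Wk):
    """Toy single-head self-attention.

    X: (seq_len, d) token embeddings (illustrative, not from a trained model).
    Returns a (seq_len, seq_len) row-stochastic attention matrix.
    """
    Q = X @ Wq
    K = X @ Wk
    logits = (Q @ K.T) / np.sqrt(K.shape[-1])
    # Softmax over key positions for each query position.
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def rank_associations(attn, query_idx, candidate_idxs):
    """Rank candidate entity positions by the attention mass the chosen
    entity (at query_idx) places on them — the retrieval step ARM builds on."""
    scores = {i: float(attn[query_idx, i]) for i in candidate_idxs}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical sequence of 6 tokens; positions 2, 4, 5 hold candidate entities.
rng = np.random.default_rng(0)
seq_len, d = 6, 8
X = rng.normal(size=(seq_len, d))
Wq = rng.normal(size=(d, d))
Wk = rng.normal(size=(d, d))

attn = self_attention_scores(X, Wq, Wk)
ranking = rank_associations(attn, query_idx=0, candidate_idxs=[2, 4, 5])
print(ranking)  # candidate positions ordered by association with entity 0
```

In a trained model the attention matrix would come from the learned encoder rather than random projections, but the retrieval step is the same: read off the chosen entity's attention row and rank the other entities by weight.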
Keywords