Self-Attention-Based Models for the Extraction of Molecular Interactions from Biological Texts

Prashant Srivastava; Saptarshi Bej; Kristina Yordanova; Olaf Wolkenhauer

doi:10.3390/biom11111591

Biomolecules (Oct 2021)

Affiliations

Prashant Srivastava: Institute of Computer Science, University of Rostock, 18059 Rostock, Germany
Saptarshi Bej: Institute of Computer Science, University of Rostock, 18059 Rostock, Germany
Kristina Yordanova: Institute of Computer Science, University of Rostock, 18059 Rostock, Germany
Olaf Wolkenhauer: Institute of Computer Science, University of Rostock, 18059 Rostock, Germany

DOI: https://doi.org/10.3390/biom11111591
Journal volume & issue: Vol. 11, no. 11, p. 1591

Abstract

For any molecule, network, or process of interest, keeping up with new publications is becoming increasingly difficult. For many cellular processes, the number of molecules and interactions that need to be considered can be very large. Automated mining of publications can support large-scale molecular interaction maps and database curation. Text mining and Natural-Language-Processing (NLP)-based techniques are finding applications in mining the biological literature, handling problems such as Named Entity Recognition (NER) and Relationship Extraction (RE). Both rule-based and Machine-Learning (ML)-based NLP approaches have been popular in this context, with multiple research and review articles examining the scope of such models in Biological Literature Mining (BLM). In this review article, we explore self-attention-based models, a special type of Neural-Network (NN)-based architecture that has recently revitalized the field of NLP, applied to biological texts. We cover self-attention models operating at either the sentence level or the abstract level, in the context of molecular interaction extraction, published from 2019 onwards. We conduct a comparative study of the models in terms of their architecture. Moreover, we discuss some limitations in the field of BLM that identify opportunities for the extraction of molecular interactions from biological text.

Published in Biomolecules

ISSN: 2218-273X (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Microbiology
Website: https://www.mdpi.com/journal/biomolecules
