Effective type label-based synergistic representation learning for biomedical event trigger detection

Anran Hao; Haohan Yuan; Siu Cheung Hui; Jian Su

doi:10.1186/s12859-024-05851-1

BMC Bioinformatics (Jul 2024)

Affiliations

Anran Hao: School of Computer Science and Engineering, Nanyang Technological University
Haohan Yuan: School of Computer Science and Engineering, Nanyang Technological University
Siu Cheung Hui: School of Computer Science and Engineering, Nanyang Technological University
Jian Su: Aural & Language Intelligence, Institute for Infocomm Research, Agency for Science, Technology and Research

DOI: https://doi.org/10.1186/s12859-024-05851-1
Journal volume & issue: Vol. 25, no. 1
pp. 1 – 22

Abstract

Background: Detecting event triggers in biomedical texts, which contain domain knowledge and context-dependent terms, is more challenging than in general-domain texts. Most state-of-the-art models rely mainly on external resources such as linguistic tools and knowledge bases to improve system performance. However, they lack effective mechanisms to obtain semantic clues from label specification and sentence context. Given its success in image classification, label representation learning is a promising approach to enhancing biomedical event trigger detection models by leveraging the rich semantics of pre-defined event type labels.

Results: In this paper, we propose the Biomedical Label-based Synergistic representation Learning (BioLSL) model, which effectively utilizes event type labels by learning their correlation with trigger words and enriches the representation contextually. The BioLSL model consists of three modules. First, the Domain-specific Joint Encoding module employs a transformer-based, domain-specific pre-trained architecture to jointly encode input sentences and pre-defined event type labels. Second, the Label-based Synergistic Representation Learning module learns the semantic relationships between input texts and event type labels, and generates a Label-Trigger Aware Representation (LTAR) and a Label-Context Aware Representation (LCAR) for enhanced semantic representations. Finally, the Trigger Classification module makes structured predictions, where each label is predicted with respect to its neighbours. We conduct experiments on three benchmark BioNLP datasets, namely MLEE, GE09, and GE11, to evaluate our proposed BioLSL model. The results show that BioLSL achieves state-of-the-art performance, outperforming the baseline models.

Conclusions: The proposed BioLSL model demonstrates good performance for biomedical event trigger detection without using any external resources. This suggests that label representation learning and context-aware enhancement are promising directions for improving performance on the task. The key enhancement is that BioLSL effectively learns to construct semantic linkages between the event mentions and type labels, which provide the latent information of label-trigger and label-context relationships in biomedical texts. Moreover, additional experiments show that BioLSL performs exceptionally well in data-scarce scenarios with limited training data.
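
The abstract does not give the equations behind LTAR or LCAR, so the following is only a minimal sketch of the general idea of relating tokens to event type labels, not the BioLSL implementation. It assumes a plain dot-product attention of each token over the label vectors, with the attention-weighted label summary appended to the token vector. The function name `label_aware_tokens`, the toy vectors, and the example sentence are all hypothetical; in the model described above the vectors would come from the joint encoder.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def label_aware_tokens(token_vecs, label_vecs):
    """Hypothetical helper: attend from each token over all label vectors
    and append the attention-weighted label summary to the token vector.
    This is generic label attention, assumed for illustration only."""
    enriched, all_weights = [], []
    for tok in token_vecs:
        weights = softmax([dot(tok, lab) for lab in label_vecs])
        summary = [
            sum(w * lab[i] for w, lab in zip(weights, label_vecs))
            for i in range(len(tok))
        ]
        enriched.append(tok + summary)  # concatenate token and label summary
        all_weights.append(weights)
    return enriched, all_weights


# Toy 2-dimensional vectors (made up for illustration).
tokens = ["IL-2", "activates", "STAT5"]
token_vecs = [[0.1, 0.0], [2.0, 0.1], [0.0, 0.2]]
labels = ["Positive_regulation", "Binding"]
label_vecs = [[1.0, 0.0], [0.0, 1.0]]

enriched, weights = label_aware_tokens(token_vecs, label_vecs)
for tok, w in zip(tokens, weights):
    print(tok, [round(x, 3) for x in w])
```

In this toy setup the token "activates" puts most of its attention on the first label because its vector is closest to that label's vector. A classifier over the enriched vectors would then see label information alongside each token, which is the intuition the abstract describes.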

Published in BMC Bioinformatics

ISSN: 1471-2105 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Biology (General)
Website: http://www.biomedcentral.com/bmcbioinformatics/
