Journal of Intelligent Systems (Dec 2023)
Biomedical event extraction using pre-trained SciBERT
Abstract
Biomedical event extraction is applied to biomedical texts to obtain a list of events within the biomedical domain. The best-performing GENIA biomedical event extraction research uses sequence labeling with a joint approach, a softmax decoder for event trigger identification, and the BioBERT v1.1 encoder. However, this model has three drawbacks: its tasks are carried out independently, it provides no special handling for multi-label event triggers, and its encoder uses a vocabulary from non-biomedical domains. We propose three changes: a pipeline approach to pass information forward between tasks, a sigmoid decoder to handle multi-label event triggers, and alternative BERT encoders whose vocabulary comes from the biomedical domain. The experiments showed that the performance of the biomedical event extraction model increased when the encoder was replaced with one built on a biomedical-domain vocabulary. Changing the encoder to SciBERT, while keeping the joint approach and softmax decoder, increased precision by 4.22 points (to 69.88) and resulted in an F1-score of 58.48.
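The difference between softmax and sigmoid decoding for multi-label event triggers can be illustrated with a minimal sketch. It is not the implementation used in this work: the label names, the logits, and the 0.5 threshold are assumptions chosen only for illustration. Softmax decoding returns exactly one trigger label per token, whereas sigmoid decoding scores each label independently and can therefore return several.

```python
import math

# Hypothetical label set for one token; chosen for illustration, not taken from the paper.
LABELS = ["None", "Gene_expression", "Positive_regulation", "Binding"]


def softmax_decode(logits):
    """Single-label decoding: return only the label with the highest probability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return [LABELS[best]]


def sigmoid_decode(logits, threshold=0.5):
    """Multi-label decoding: return every label whose independent probability
    reaches the threshold (0.5 is an assumed value)."""
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    return [label for label, p in zip(LABELS, probs) if p >= threshold]


# Illustrative logits for a token that triggers two event types at once.
token_logits = [-3.0, 2.1, 1.7, -2.5]
single = softmax_decode(token_logits)
multi = sigmoid_decode(token_logits)
print(single)  # one label only
print(multi)   # every label above the threshold
```

With these assumed logits, softmax decoding keeps only the highest-scoring trigger type, while sigmoid decoding keeps both trigger types whose scores pass the threshold.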
Keywords