BMC Bioinformatics (Mar 2023)
B-LBConA: a medical entity disambiguation model based on Bio-LinkBERT and context-aware mechanism
Abstract
Background
The main task of medical entity disambiguation is to link mentions, such as diseases, drugs, or complications, to standard entities in the target knowledge base. To our knowledge, models based on Bidirectional Encoder Representations from Transformers (BERT) have achieved good results on this task. Unfortunately, these models consider only the text of the current document, fail to capture dependencies across documents, and do not sufficiently mine the hidden information in contextual texts.
Results
We propose B-LBConA, a model based on Bio-LinkBERT and a context-aware mechanism. Specifically, B-LBConA first uses Bio-LinkBERT, which is capable of learning cross-document dependencies, to obtain embedding representations of mentions and candidate entities. Cross-attention is then used to capture mention-to-entity and entity-to-mention interaction information. Finally, B-LBConA incorporates disambiguation clues about the relevance between the mention context and candidate entities via the context-aware mechanism.
Conclusions
Experimental results on three publicly available datasets, NCBI, ADR and ShARe/CLEF, show that B-LBConA achieves significantly more accurate performance than existing models.
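To make the pipeline described in the Results concrete, the sketch below illustrates the general idea: encode a mention (with its context) and a candidate entity using a Bio-LinkBERT checkpoint, apply cross-attention in both directions, and derive a relevance score. This is an illustrative sketch only, not the authors' implementation; the checkpoint name, the pooling strategy, and the cosine-based scoring are assumptions made for the example.

```python
# Illustrative sketch (not the paper's code): mention/entity encoding with a
# Bio-LinkBERT checkpoint followed by bidirectional cross-attention.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

# Assumed publicly available Bio-LinkBERT checkpoint on Hugging Face.
tokenizer = AutoTokenizer.from_pretrained("michiyasunaga/BioLinkBERT-base")
encoder = AutoModel.from_pretrained("michiyasunaga/BioLinkBERT-base")

def encode(text: str) -> torch.Tensor:
    """Return token-level hidden states, shape (seq_len, hidden)."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return encoder(**inputs).last_hidden_state.squeeze(0)

def cross_attend(queries: torch.Tensor, keys_values: torch.Tensor) -> torch.Tensor:
    """Let each query token attend over the other sequence's tokens."""
    scores = queries @ keys_values.T / queries.size(-1) ** 0.5  # (q_len, kv_len)
    weights = F.softmax(scores, dim=-1)
    return weights @ keys_values                                # (q_len, hidden)

mention_ctx = "The patient developed ataxia after the dose was increased."
candidate = "Cerebellar ataxia (disorder)"

m = encode(mention_ctx)   # mention-side token representations
e = encode(candidate)     # candidate-entity token representations

m2e = cross_attend(m, e)  # mention-to-entity interaction
e2m = cross_attend(e, m)  # entity-to-mention interaction

# Hypothetical relevance score: cosine similarity of mean-pooled interaction
# vectors (a stand-in for the paper's context-aware scoring, which the
# abstract does not specify).
score = F.cosine_similarity(m2e.mean(0, keepdim=True), e2m.mean(0, keepdim=True))
print(float(score))
```

In practice such a score would be computed for every candidate entity retrieved for the mention, with the highest-scoring candidate selected as the linked entity.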
Keywords