Complex & Intelligent Systems (Apr 2023)
A bilateral context and filtering strategy-based approach to Chinese entity synonym set expansion
Abstract
Abstract Entity synonyms play a significant role in entity-based tasks. Previous approaches use linguistic syntax, distributional, and semantic features to expand entity synonym sets from text corpora. Due to the flexibility and complexity of the Chinese language expression, the aforementioned approaches are still difficult to expand entity synonym sets robustly from Chinese text, because these approaches fail to track holistic semantics among entities and suffer from error propagation. This paper introduces an approach for expanding Chinese entity synonym sets based on bilateral context and filtering strategy. Specifically, the approach consists of two novel components. First, a bilateral-context-based Siamese network classifier is proposed to determine whether a new entity should be inserted into the existing entity synonym set. The classifier tracks the holistic semantics of bilateral contexts and is capable of imposing soft holistic semantic constraints to improve synonym prediction. Second, a filtering-strategy-based set expansion algorithm is presented to generate Chinese entity synonym sets. The filtering strategy enhances semantic and domain consistencies to filter out wrong synonym entities, thereby mitigating error propagation. Experimental results on two Chinese real-world datasets demonstrate that the proposed approach is effective and outperforms the selected existing state-of-the-art approaches to the Chinese entity synonym set expansion task.
Keywords