IEEE Access (Jan 2019)
A Character-Level BiLSTM-CRF Model With Multi-Representations for Chinese Event Detection
Abstract
Using the word as the basic unit may undermine the performance of Chinese event detection models, because segmentation tools often produce inaccurate word boundaries. In addition, word embeddings are context-independent and cannot handle the polysemy of event triggers, which further limits performance. To address these issues, we propose a BiLSTM-CRF (Bidirectional Long Short-Term Memory with Conditional Random Field) model that treats event detection as a character-level sequence labeling problem and uses contextualized representations to disambiguate event triggers. Experiments show that the proposed method sets a new state of the art, which shows that Chinese characters can replace words as the basic unit for Chinese event detection. Moreover, using contextualized representations reduces the number of false positives, which indicates that this kind of representation can remedy the weakness of word embeddings. Based on these results, we believe that character-level models are worth exploring in future work.
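To illustrate what framing event detection as character-level sequence labeling can look like, the following minimal sketch assigns one tag to every character of a sentence, so no word segmentation is required. The BIO tagging scheme, the function name char_bio_labels, the example sentence, and the event type "Die" are assumptions made for illustration; the abstract does not specify the tagging scheme or the data format.

```python
from typing import List, Tuple


def char_bio_labels(
    sentence: str, triggers: List[Tuple[int, int, str]]
) -> List[str]:
    """Return one BIO tag per character of `sentence`.

    `triggers` holds (start, end_exclusive, event_type) character offsets.
    The BIO scheme is an assumed choice, not taken from the paper.
    """
    labels = ["O"] * len(sentence)
    for start, end, event_type in triggers:
        if not (0 <= start < end <= len(sentence)):
            raise ValueError(f"trigger span ({start}, {end}) is out of range")
        # First character of the trigger opens the span.
        labels[start] = f"B-{event_type}"
        # Remaining characters continue the span.
        for i in range(start + 1, end):
            labels[i] = f"I-{event_type}"
    return labels


# Hypothetical example: the trigger "去世" occupies characters 4 and 5.
sentence = "他在医院去世了"
triggers = [(4, 6, "Die")]
labels = char_bio_labels(sentence, triggers)

for ch, tag in zip(sentence, labels):
    print(ch, tag)
```

A sequence labeler such as a BiLSTM-CRF would then be trained to predict these per-character tags, so a trigger is recovered from the tagged characters rather than from the output of a segmentation tool.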
Keywords