IEEE Access (Jan 2019)
Improved Distant Supervised Model in Tibetan Relation Extraction Using ELMo and Attention
Abstract
The task of relation extraction is to classify the relation between two entities in a sentence. Distantly supervised relation extraction can automatically align entities in text with a knowledge base, so it does not require manually labeled training data. For relation extraction in low-resource languages such as Tibetan, the main problem is the lack of labeled training data. In this paper, we propose an improved distantly supervised relation extraction model based on the Piecewise Convolutional Neural Network (PCNN) to expand the Tibetan corpus. We add a self-attention mechanism and a soft-label method to reduce the impact of wrong labels, and use Embeddings from Language Models (ELMo) to address the problem of semantic ambiguity. In addition, to account for the characteristics of Tibetan, we combine word vectors with part-of-speech vectors to extract deeper word features. Finally, the experimental results show that the P@avg value increases by 14.4% over the baseline.
Keywords