Applied Artificial Intelligence (Dec 2024)
Distance-Based Korean WordNet (alias KorLex) Embedding Model
Abstract
The objective of this study was to create graph-embedding vectors from the Korean WordNet (KorLex) and apply them to neural network word-embedding models. Semantic knowledge, especially the lexical semantic knowledge of a language, can be represented either by word-embedding vectors or by the graph structure of a lexical database such as WordNet. The two representations capture much of the same semantics; however, some semantic knowledge is captured by only one of them, and some by neither. In a previous study, Path2vec mapped WordNet graphs to graph-embedding vectors using similarity scores between pairs of words. In this study, we propose a two-part approach. First, we map the knowledge in the Korean lexical database KorLex onto graph-embedding vectors. We then apply these embedding vectors to deep neural network word embeddings to capture additional semantic knowledge of the Korean language. On a custom test set, the proposed approach improved performance in similarity and analogy analyses, indicating that it captures additional semantic knowledge. We plan to apply a variant of this approach to other deep neural embedding models.
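To make the idea of mapping a lexical graph onto embedding vectors via pairwise similarity scores more concrete, the following is a minimal sketch and not the model proposed in this study. It assumes a toy, hypothetical hypernym graph (not actual KorLex data), a path-based similarity defined here as 1 / (1 + shortest-path length), and a plain squared-error objective that pushes the dot product of two node vectors toward their graph similarity. All names, hyperparameters, and the choice of similarity measure are illustrative assumptions.

```python
import random
from collections import deque

# Hypothetical toy hypernym graph; these edges are NOT taken from KorLex.
EDGES = [
    ("entity", "animal"), ("entity", "plant"),
    ("animal", "dog"), ("animal", "cat"),
    ("dog", "puppy"), ("plant", "tree"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def path_similarity(adj):
    """Assumed similarity: 1 / (1 + shortest-path length), for each pair u < v."""
    sims = {}
    nodes = sorted(adj)
    for u in nodes:
        dist = shortest_path_lengths(adj, u)
        for v in nodes:
            if u < v:
                sims[(u, v)] = 1.0 / (1.0 + dist[v])
    return sims


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def train_embeddings(sims, nodes, dim=8, epochs=2000, lr=0.05, seed=0):
    """SGD on (dot(e_u, e_v) - sim(u, v))^2; the factor 2 is folded into lr."""
    rng = random.Random(seed)
    emb = {n: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for n in nodes}
    pairs = list(sims.items())
    for _ in range(epochs):
        rng.shuffle(pairs)
        for (u, v), target in pairs:
            err = dot(emb[u], emb[v]) - target
            old_u, old_v = emb[u][:], emb[v][:]  # copies for a simultaneous update
            for k in range(dim):
                emb[u][k] -= lr * err * old_v[k]
                emb[v][k] -= lr * err * old_u[k]
    return emb


adj = build_adjacency(EDGES)
sims = path_similarity(adj)
emb = train_embeddings(sims, sorted(adj))

# Compare graph similarity with the learned dot product for two pairs.
for pair in [("dog", "puppy"), ("puppy", "tree")]:
    print(pair, round(sims[pair], 3), round(dot(emb[pair[0]], emb[pair[1]]), 3))
```

In this sketch, nodes that are close in the graph (such as the hypothetical pair "dog" and "puppy") should end up with a larger dot product than distant ones (such as "puppy" and "tree"). Vectors trained this way could then be combined with corpus-based word embeddings, which is the direction the abstract describes; how that combination is done is not specified here and is left out of the sketch.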