IEEE Access (Jan 2021)
Mirroring Vector Space Embedding for New Words
Abstract
Most embedding models used in natural language processing require retraining of the entire model to obtain the embedding value of a new word. In current systems, the amount of training data grows with each retraining, so retraining the entire model whenever new words emerge is highly inefficient. Moreover, since a language contains a huge number of words and its characteristics change continuously over time, it is not easy to embed all words. To solve both problems, we propose a new embedding model, the Mirroring Vector Space (MVS), which obtains the embedding of a new word from a previously built word embedding model without retraining it. The MVS embedding model has a convolutional neural network (CNN) structure and presents a novel strategy for obtaining word embeddings: it predicts the embedding value of a word by learning the vector space of an existing embedding model from the explanations of the word. It also provides flexibility with respect to external resources, reusability that saves training time, and portability in that it can be combined with any existing embedding model. We verify these three properties, along with the novelty of the approach, in our experiments.
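To make the core idea concrete, the following is a minimal, hypothetical sketch (not the authors' released code) of the strategy the abstract describes: a small CNN reads the token embeddings of a word's explanation and regresses to the coordinates the pretrained model would assign to that word, so a new word can be embedded without retraining the base model. All names (DefinitionToVec, embed_dim, the kernel sizes) are illustrative assumptions.

```python
# Sketch: map a word's textual explanation to a vector in an existing
# embedding space, so new words get embeddings without retraining.
import torch
import torch.nn as nn

class DefinitionToVec(nn.Module):
    """CNN that reads the embeddings of an explanation's tokens and
    predicts the explained word's vector in the existing embedding space."""
    def __init__(self, embed_dim=300, num_filters=128, kernel_sizes=(2, 3, 4)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes
        )
        self.proj = nn.Linear(num_filters * len(kernel_sizes), embed_dim)

    def forward(self, definition_vectors):
        # definition_vectors: (batch, seq_len, embed_dim) -- embeddings of
        # the explanation's tokens, taken from the pretrained base model.
        x = definition_vectors.transpose(1, 2)      # (batch, embed_dim, seq_len)
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.proj(torch.cat(pooled, dim=1))  # predicted embedding

# Training pairs: (explanation tokens of a known word, that word's existing
# embedding). At inference, a new word's explanation yields its vector.
model = DefinitionToVec()
explanation = torch.randn(8, 20, 300)  # batch of 8 explanations, 20 tokens each
target = torch.randn(8, 300)           # embeddings from the pretrained model
loss = nn.functional.mse_loss(model(explanation), target)
loss.backward()
```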
Keywords