IEEE Access (Jan 2020)
Semantic Similarity Estimation Using Vector Symbolic Architectures
Abstract
For many natural language processing applications, estimating the similarity and relatedness between words is a key task that serves as the basis for classification and generalization. Vector semantic models (VSMs) have become a fundamental language modeling tool: they represent words as points in a high-dimensional space and follow the distributional hypothesis of meaning, which assumes that words appearing in similar contexts have similar meanings. In this paper, we propose a model whose representations are based on the semantic features associated with a concept in the ConceptNet knowledge graph. The proposed model builds on a vector symbolic architecture (VSA) framework, which defines a set of arithmetic operations for encoding the semantic features within a single high-dimensional vector. These vector representations incorporate several types of information beyond word distribution and, owing to the properties of high-dimensional spaces, have the additional advantage of being interpretable. We analyze the model's performance on the SimLex-999 dataset, on which commonly used distributional models (e.g., word2vec and GloVe) perform poorly. Our results are comparable to those of other hybrid models and surpass several state-of-the-art distributional and knowledge-based models.
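The abstract alludes to the core VSA mechanism: arithmetic operations that encode a concept's semantic features into one high-dimensional vector, whose overlap with other vectors then estimates similarity. The sketch below is only illustrative; the abstract does not specify the paper's VSA variant, so it assumes MAP-style (Multiply-Add-Permute) bipolar hypervectors, a dimensionality of 10,000, and hypothetical ConceptNet relations (`IsA`, `AtLocation`) and features chosen for the example.

```python
import numpy as np

D = 10_000  # hypervector dimensionality (a typical VSA choice; assumption)
rng = np.random.default_rng(0)

def random_hv():
    """Random bipolar hypervector; random HVs are quasi-orthogonal in high dimensions."""
    return rng.choice([-1, 1], size=D)

def bind(a, b):
    """Binding (elementwise multiplication) associates a relation with a concept."""
    return a * b

def bundle(*hvs):
    """Bundling (elementwise majority via sign of the sum) superposes features."""
    return np.sign(np.sum(hvs, axis=0))

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical ConceptNet-style (relation, concept) features for each word.
IsA, AtLocation = random_hv(), random_hv()
animal, pet, home, yard = random_hv(), random_hv(), random_hv(), random_hv()
vehicle, road, garage = random_hv(), random_hv(), random_hv()

cat = bundle(bind(IsA, animal), bind(IsA, pet), bind(AtLocation, home))
dog = bundle(bind(IsA, animal), bind(IsA, pet), bind(AtLocation, yard))
car = bundle(bind(IsA, vehicle), bind(AtLocation, road), bind(AtLocation, garage))

print(cosine(cat, dog))  # noticeably positive: shared semantic features
print(cosine(cat, car))  # near zero: no shared features
```

Because each feature remains recoverable from the bundled vector (unbinding with the relation vector and matching against known concepts), representations of this kind admit the interpretability the abstract claims, unlike purely distributional embeddings.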
Keywords